DOCUMENT RESUME 



ED 081 636 



SE 016 779 



AUTHOR        Paulson, James A.
TITLE         An Evaluation of Instructional Strategies in a Simple
              Learning Situation.
INSTITUTION   Stanford Univ., Calif. Inst. for Mathematical Studies
              in Social Science.
SPONS AGENCY  Office of Naval Research, Washington, D.C. Personnel
              and Training Research Programs Office.
REPORT NO     NR-154-326; TR-209
PUB DATE      30 Jul 73
NOTE          93p.
EDRS PRICE    MF-$0.65 HC-$3.29
DESCRIPTORS   Educational Psychology; *Instruction; *Learning;
              Learning Theories; *Mathematical Models; *Research;
              Teaching Methods
IDENTIFIERS   Research Reports
ABSTRACT

Three different strategies of choosing items for
presentation in a simple list learning situation are compared. The
instructional task was to teach the correct response to a number of
stimulus items, using a paired-associate teaching procedure. Only one
item could be presented on a given trial and the total number of
trials was limited. The optimization problem considered was to find
the best strategy for deciding which item to present a subject on a
given trial, based on his performance on previous trials. The
optimization criterion was the number of items retained on a
posttest. Three progressively more sophisticated models of the
learning process, the linear model, the one-element model, and a
forgetting model, suggested different candidates for the optimal
procedure. This paper is devoted to the derivation of theoretical
expressions for operating characteristics of interest for each of the
three strategies. These expressions, based on the most adequate model
(the forgetting model), then permit direct numerical comparisons of
the predicted performance of the three strategies for specified
values of the model parameter. If the material to be learned was
relatively easy, then the theoretical differences between the
strategies were relatively insignificant. If the material to be
learned was difficult, then the theoretical differences between the
strategies were quite significant. The differences depended on the
parameter values of the forgetting model in a systematic way.

(Author/DT) 






AN EVALUATION OF INSTRUCTIONAL STRATEGIES IN A 
SIMPLE LEARNING SITUATION 



BY 



JAMES A. PAULSON



U.S. DEPARTMENT OF HEALTH,
EDUCATION & WELFARE
NATIONAL INSTITUTE OF
EDUCATION



TECHNICAL REPORT NO. 209 



JULY 30, 1973 



PSYCHOLOGY AND EDUCATION SERIES 



INSTITUTE FOR MATHEMATICAL STUDIES IN THE SOCIAL SCIENCES

STANFORD UNIVERSITY 
STANFORD, CALIFORNIA 




FILMED FROM BEST AVAILABLE COPY 






AN EVALUATION OF INSTRUCTIONAL STRATEGIES IN A
SIMPLE LEARNING SITUATION

by

James A. Paulson



TECHNICAL REPORT NO. 209
July 30, 1973



PSYCHOLOGY & EDUCATION SERIES



Reproduction in Whole or in Part is Permitted for Any
Purpose of the United States Government



This research was sponsored by the Personnel and Training
Research Programs, Psychological Sciences Division, Office
of Naval Research, under Contract No. N00014-67-A-0112-0054,
Contract Authority Identification Number, NR No. 154-326.



INSTITUTE FOR MATHEMATICAL STUDIES IN THE SOCIAL SCIENCES
STANFORD UNIVERSITY
STANFORD, CALIFORNIA



Security Classification

DOCUMENT CONTROL DATA - R & D

(Security classification of title, body of abstract and indexing
annotation must be entered when the overall report is classified)

1. ORIGINATING ACTIVITY (Corporate author)
   Institute for Mathematical Studies in the Social Sciences
   Stanford University
   Stanford, California 94305

2a. REPORT SECURITY CLASSIFICATION
   Unclassified

2b. GROUP

3. REPORT TITLE
   An Evaluation of Instructional Strategies in a Simple Learning Situation

4. DESCRIPTIVE NOTES (Type of report and inclusive dates)
   Technical Report

5. AUTHOR(S) (First name, middle initial, last name)
   James A. Paulson

6. REPORT DATE
   30 July 1973

7a. TOTAL NO. OF PAGES
   81

7b. NO. OF REFS
   18

8a. CONTRACT OR GRANT NO.
   N00014-67-A-0112-0054

b. PROJECT NO.
   NR 154-326

9a. ORIGINATOR'S REPORT NUMBER(S)
   Technical Report No. 209

9b. OTHER REPORT NO(S) (Any other numbers that may be assigned
    this report)

10. DISTRIBUTION STATEMENT
   Approved for public release; distribution unlimited.

11. SUPPLEMENTARY NOTES

12. SPONSORING MILITARY ACTIVITY
   Personnel and Training Research Programs,
   Office of Naval Research
   Arlington, VA 22217

13. ABSTRACT

This paper compares three different strategies of choosing items for presentation
in a simple list learning situation. The instructional task is to teach the correct
response to a number of stimulus items, using a paired-associate teaching procedure.
Only one item can be presented on a given trial and the total number of trials is
limited. The optimization problem considered is to find the best strategy for
deciding which item to present a subject on a given trial, based on his performance
on previous trials. The optimization criterion is the number of items retained on a
posttest. Three progressively more sophisticated models of the learning process,
the linear model, the one-element model, and a forgetting model, suggest different
candidates for the optimal procedure. The paper is devoted to the derivation of
theoretical expressions for operating characteristics of interest for each of the
three strategies. These expressions, based on the most adequate model, the forgetting
model, then permit direct numerical comparisons of the predicted performance of the
three strategies for specified values of the model parameter. If the material to be
learned is relatively easy, then the theoretical differences between the strategies
are relatively insignificant. If the material to be learned is difficult, then the
theoretical differences between the strategies are quite significant. The differences
depend on the parameter values of the forgetting model in a systematic way.



KEY WORDS

Instructional Strategies
Optimization of Instruction
Mathematical Models of Learning
Computer-assisted Instruction (CAI)



INTRODUCTION

One obvious aim of educational psychology is to seek optimal teaching
strategies for certain recurring instructional situations, based on
knowledge of the learning process involved. Undoubtedly, this aim is
implicit in most of the experimental work in this area. It is only
recently, however, that there have been serious efforts at formal
derivation of teaching strategies from descriptive models of learning
processes. There are a number of good reasons why formal study of this
problem has been neglected. Before formal derivation of a strategy can
begin, an explicit, descriptively adequate model of the learning process
under consideration must exist. Such models have been developed only in
the last twenty years for even the simplest learning situations. Given
an adequate descriptive framework, it is still necessary to formulate
the optimization problem in terms amenable to mathematical analysis.
The developments in sequential decision theory and mathematical
programming (which now make such analyses feasible) have all occurred
since 1945. Finally, formal optimization questions were completely
academic prior to the development of modern computer technology. The
large amount of record keeping and simple calculation which must be
accomplished in brief time periods in order to implement optimal
procedures effectively limits the use of these procedures to
computer-assisted instruction settings. Now that computer-assisted
instruction is becoming more widespread, optimization questions assume
practical importance. This paper is intended as a contribution to the
study of an interesting optimization problem in a learning task which
commonly occurs in instruction.

If appropriate mathematical tools are to be brought to bear on an
optimization problem, it is necessary to place some rather severe
restrictions on the nature of the learning situation to be considered.
The present study is limited to situations in which the task is to teach
the correct responses to a number of stimulus items, using a
paired-associate teaching procedure. It is assumed that the items are
learned independently, in the sense that the difficulty of learning an
unknown item does not depend on whether or not other items are known.
Only one item can be presented on a given trial and the total number of
trials is limited. The optimization problem to be considered is to find
the best strategy for deciding which item to present a subject on a
given trial, based on his performance on previous trials.

There are two principal reasons for concentrating on the item-selection
problem for paired-associate learning instead of considering other
learning paradigms which can be described reasonably well by existing
models, e.g., simple cue learning. First, the optimization problem for
the paired-associate case has received a fair amount of both theoretical
and empirical attention and the direction in which more study is needed
is fairly clear. Second, this paradigm is directly relevant to some
practical learning tasks, such as drill activities used in the learning
of vocabulary items in second-language learning, the acquisition of a
sight vocabulary and a knowledge of phonics in initial reading, and the
mastery of spelling.






Two Strategies from Two Models

In subsequent chapters, three different strategies for choosing
items to present will be examined in some detail. The first two
strategies are based directly on corresponding simple models of the
learning process. The third strategy is also motivated by model
considerations, but the connection between model and strategy is not as
direct in this case.

The first strategy may be described as follows. On a given trial,
present the item which has received the fewest presentations up to that
point. If more than one item satisfies this criterion, select the item
at random from the set satisfying the criterion. Upon examination, this
strategy is seen to be equivalent to the standard cyclic presentation
procedure commonly employed in experiments on paired-associate learning.
It amounts to presenting all items once, randomly reordering them,
presenting them again, and repeating the procedure until the number of
trials allocated to instruction has been exhausted. This procedure will
be referred to hereafter as the RC (for random cyclic) procedure or
strategy.
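
The RC selection rule can be sketched in a few lines of Python. This is an illustrative sketch only and is not part of the original report; the function names rc_select and rc_schedule, and the representation of items as integer indices, are assumptions made for the example.

```python
import random

def rc_select(presentation_counts, rng=random):
    """Pick an item with the fewest presentations so far; break ties at random."""
    fewest = min(presentation_counts)
    candidates = [i for i, n in enumerate(presentation_counts) if n == fewest]
    return rng.choice(candidates)

def rc_schedule(num_items, num_trials, seed=0):
    """Generate a presentation order by applying the RC rule on every trial."""
    rng = random.Random(seed)          # hypothetical fixed seed, for repeatability
    counts = [0] * num_items
    order = []
    for _ in range(num_trials):
        item = rc_select(counts, rng)
        counts[item] += 1
        order.append(item)
    return order
```

Under this rule every consecutive block of num_items trials contains each item exactly once, in a freshly randomized order, which is the cyclic behavior described above.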

A rationale for the RC procedure can be provided by a simple linear
model of the learning process. In this model, the state of the learner
with respect to each item in a list depends only on the number of times
each item has been presented. The state of the learner is represented
by his momentary probability of error for each item. At the start of
instruction, all items have some initial probability of error, say $q_1$;
each time an item is presented, its error probability is reduced by a
factor $\alpha$ which is less than one. That is,

    $$q_{n+1} = \alpha q_n ,$$

or alternatively

    $$q_{n+1} = \alpha^{n} q_1 .$$

When an item is presented for the $n$th time, the reduction in error
probability is given by

    $$q_n - q_{n+1} = (1 - \alpha)\,\alpha^{n-1} q_1 .$$

The fact that the decrement in error probability for an item becomes 
smaller each time it is presented leads naturally to the RC procedure. 
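
A short numerical sketch may make the argument concrete. It assumes the linear recursion in which each presentation multiplies the error probability by a factor alpha less than one, starting from an initial value q1; the function names and the particular values used in the usage note are hypothetical and chosen only for illustration.

```python
def linear_error_probabilities(q1, alpha, num_presentations):
    """Error probability before the n-th presentation, n = 1..num_presentations."""
    return [q1 * alpha ** (n - 1) for n in range(1, num_presentations + 1)]

def decrements(qs):
    """Reduction in error probability produced by each successive presentation."""
    return [qs[k] - qs[k + 1] for k in range(len(qs) - 1)]
```

For example, decrements(linear_error_probabilities(0.8, 0.5, 5)) is a strictly decreasing sequence, so under this model the next presentation is always most profitably spent on an item that has been presented least often, which is what the RC procedure does.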

The second strategy is more complicated to describe, but the essential
idea is very simple: ignore responses prior to the last error on an
item; present the item which has received the fewest correct responses
since its last error. If more than one item is eligible according to
this rule, select the item to be presented at random from the set of
eligible items. Karush and Dear (1966) proved that this strategy is
optimal if the assumption that learning proceeds according to the
so-called one-element model is valid. The strategy is optimal in the
sense that it maximizes the expected number of items learned in a fixed
number of presentations. For this reason, this strategy will be referred
to hereafter as the OEM strategy.
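
The OEM selection rule can be sketched as follows. This is an illustration under stated assumptions rather than code from the report: each item's history is assumed to be stored as a list of booleans (True for a correct response), and the names correct_since_last_error and oem_select are hypothetical.

```python
import random

def correct_since_last_error(responses):
    """Count the correct responses (True) made after the most recent error (False)."""
    count = 0
    for correct in reversed(responses):
        if not correct:
            break
        count += 1
    return count

def oem_select(histories, rng=random):
    """Pick an item with the fewest correct responses since its last error.

    histories: one list of booleans per item; ties are broken at random.
    """
    scores = [correct_since_last_error(h) for h in histories]
    fewest = min(scores)
    return rng.choice([i for i, s in enumerate(scores) if s == fewest])
```

Note that an item which has never been presented has a score of 0 under this sketch and is therefore always eligible, as is any item whose most recent response was an error.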

According to the one-element model, a student is in one of two states
with respect to each item at any given point in time: the learned state
or the unlearned state. When an unlearned item is presented, it moves
into the learned state with probability c. That is, the state of an
unlearned item following a presentation is

    $$\begin{cases} \text{unlearned} & \text{with probability } 1-c , \\ \text{learned} & \text{with probability } c . \end{cases}$$

Once an item is learned, it remains in the learned state throughout the
course of instruction, so there is no reason to present the item again.
A subject may respond correctly to an item even though the item is in
the unlearned state and the subject is guessing. In effect, the OEM
strategy selects items for presentation that are most likely to be in
the unlearned state.

The linear model, which provides a basis for the RC procedure, and
the one-element model, which provides a basis for the OEM procedure, are
the simplest models for paired-associate learning having any empirical
support. On the whole, the one-element model gives a better account of
data from experiments using the RC procedure than the linear model; see,
for example, Bower (1961). The data which lead to this conclusion would
also lead one to believe that a given number of presentations allocated
to a list of items using the OEM procedure would produce significantly
better results than the same number of presentations allocated according
to the RC procedure. The predicted advantage for the OEM strategy often
fails to materialize, unless special modifications are made in the OEM
procedure. This anomaly provides the motivation for the developments
to be reported in subsequent chapters.

Atkinson and Crothers (1964) reported data comparing performance
of several models of learning and retention which suggest that
consideration should be given to procedures based on models taking
forgetting phenomena into account. However, it turns out that the
performance of procedures based directly on forgetting models is
difficult to characterize in a general way. The two strategies described
above are special in two respects, which make the relationship between
model and strategy simpler for them than it is in general. One special
feature both the OEM and RC procedures possess is that they maximize
both immediate gain in probability of correct response and global gain
over the course of the experiment considered as a whole. It is the
exception rather than the rule for a procedure to be capable of
maximizing both of these quantities. Another rather rare property which
these procedures have in common is that implementation of neither one
depends on the parameter values of the model on which it is based.

The approach to be taken in this paper is to use a general theory
of learning and forgetting to describe performance of three strategies:
the two already mentioned and a third hypothetical strategy. This
hypothetical strategy is a modification of the OEM strategy which would
be optimal under the assumption of the general forgetting model if it
could be carried out. For reasons which will be clear when the strategy
is described in detail later, the strategy would be impossible to
implement. Nevertheless, the strategy serves as a useful bound against
which implementable strategies can be compared.

The rest of this paper is devoted to consideration of the relative
performance of the three strategies discussed above, using the general
forgetting theory as a model framework. The forgetting theory and the
reasons for adopting it for this study are described in Chapter II; this
chapter also contains several counterexamples which demonstrate the
infeasibility of dealing with globally optimal strategies in the context
of the forgetting theory framework. Because it is not feasible to deal
with globally optimal strategies, it is necessary to take a descriptive
approach to the evaluation of presentation strategies. Chapter III is
devoted to the definition and derivation of formulas for operating
characteristics of the three strategies described above. These formulas
are then used in Chapter IV to make numerical comparisons of the
strategies for selected special cases within the general framework of
forgetting theory. In Chapter V the conclusions of the study are
summarized and implications for future research are considered. The
results presented in this paper are new, at least to the author's
knowledge, unless otherwise noted.



CHAPTER II

A GENERAL FORGETTING THEORY AND ITS IMPLICATIONS
FOR PRESENTATION STRATEGIES

The General Forgetting Theory proposed by Rumelhart (hereafter to
be called the GFT) is a synthesis of several models which represent
different ways to generalize the OEM. These models retain much of the
simple all-or-none character of the OEM while introducing the factor of
forgetting, which the OEM does not take into account. The theory is
relevant to the problem being treated here because the phenomena of
forgetting could work in a variety of ways to undermine the OEM strategy.

There is considerable evidence now that forgetting works in such a
way that the OEM strategy is not as effective as it theoretically should
be. In an experiment designed to test the advantage of the OEM procedure
over the standard RC procedure, reported by Dear, Silberman, Estavan and
Atkinson (1967), the advantage predicted for the OEM was not observed.
Experimental findings of Hellyer (1962) and Greeno (1964), among others,
suggest that when items are presented repeatedly within a short period
of time (as they often are under the OEM procedure), many are responded
to correctly on the massed presentations, but are then rapidly forgotten.
This interpretation is consistent with that of Dear and his associates.
Experiments which have shown the predicted advantage for the OEM
procedure, such as one by Lorton reported in Atkinson and Paulson (1972),
have modified the procedure to minimize the number of massed
presentations.

The General Forgetting Theory

The GFT can be described as follows. At any given time, a subject
is in one of three possible states of learning with respect to each item:
the unlearned state (U), the short-term retention state (S), or the
long-term retention state (L). When an item is presented, transitions
between states occur according to the following stochastic matrix.

                      State on trial t+1       Probability of correct
                       L      S      U         response, given the state

   State on      L     1      0      0                   1
   trial t       S     c     1-c     0                   1
                 U     a      b    1-a-b                 g



That is to say, if an unlearned item is presented, then with
probability a it is learned in such a way that it will be retained for a
relatively long time, with probability b it is learned in such a way that
it is likely to be forgotten soon, and with probability 1-a-b it remains
unlearned. If an item in the short-term retention state is presented,
then with probability c it will shift to the long-term retention state,
and with probability 1-c it will remain in the short-term state.
Whenever an item reaches the long-term (learned) state, it remains there
for the duration of the experiment.

When an item in either the long- or short-term state is presented,
the correct response is given with probability 1. If an unlearned item
is presented, the correct response is given with the guessing
probability g.

In this extended model it is necessary to consider what happens to
items which are not presented on a trial. Transitions between states
occur according to the matrix



                      State on trial t+1
                       L      S      U

   State on      L     1      0      0
   trial t       S     0     1-f     f
                 U     0      0      1

That is, items in the short-term retention state are forgotten with
probability f, while items in the long-term retention state or the
unlearned state are unaffected.

If it is stipulated that the parameter b in the learning matrix is
0, then an item can never enter the short-term state, so the model
reduces to the OEM in this case.
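
The two transition rules of the GFT (one for the presented item, one for every item not presented) can be sketched as a small simulation. This is a hedged illustration, not the author's code: the state labels, the function names, and the use of Python's random module are assumptions made for the example, while the parameters a, b, c, f, and g are those defined in the text.

```python
import random

L, S, U = "L", "S", "U"  # long-term, short-term, unlearned

def step_presented(state, a, b, c, rng):
    """Transition of an item that is presented on the current trial."""
    r = rng.random()
    if state == U:
        if r < a:
            return L          # learned durably
        if r < a + b:
            return S          # learned, but liable to be forgotten
        return U              # remains unlearned
    if state == S:
        return L if r < c else S
    return L                  # the long-term state is absorbing

def step_not_presented(state, f, rng):
    """Transition of an item that is not presented: only S can be forgotten."""
    if state == S and rng.random() < f:
        return U
    return state

def p_correct(state, g):
    """Probability of a correct response, given the state."""
    return g if state == U else 1.0
```

With b set to 0 the short-term state is never entered from U, so a simulated item moves only between U and L, which is the one-element model with learning probability a.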

Perhaps the quickest way to follow the character of the GFT framework
is to consider briefly some of the other models which it encompasses
as special cases. Table 2.1 is intended to give the reader an abbreviated
natural history of the formulation just given. The various special cases
all assume a forgetting matrix of the form given above. They differ
with respect to the form of the learning matrix. The differences reflect
differences in assumptions regarding two separate issues.

The earlier models all assume that some learning takes place whenever
an item in the unlearned state is presented. Hence, the probability
of staying in the unlearned state is 0. The models differ regarding the
relative size of the probability of transition to the long-term state
from the unlearned and short-term states, respectively. The model of
Atkinson and Crothers (1964) assumes that these transition probabilities
are equal. The model of Greeno (1964) assumes that transitions to the
long-term state can only take place from the unlearned state, and the
model of Bernbach (1965) assumes that these transitions can only occur
from the short-term state. These three models all have learning matrices
depending on a single parameter. If one wishes to leave the issue of the
relative size of these transition parameters open, he can do so at the
price of introducing a second learning parameter, as indicated in the
fourth transition matrix in Table 2.1.

                               Table 2.1

            Transition Matrices of the Learning Process for
            Special Cases of the General Forgetting Theory

   State on            State on trial n+1
   trial n        L             S              U          Comments

      L           1             0              0          One of the original versions
      S           a            1-a             0          of the long-short model, due
      U           a            1-a             0          to Atkinson & Crothers (1964)

      L           1             0              0          A coding model due to
      S           0             1              0          Greeno (1964)
      U           a            1-a             0

      L           1             0              0          A partial learning model due
      S           a            1-a             0          to Bernbach (1965)
      U           0             1              0

      L           1             0              0          A general model encompassing
      S           b            1-b             0          the three above
      U           a            1-a             0

      L           1             0              0          A further extension introducing
      S       $b\gamma$     $1-b\gamma$        0          an attention parameter $\gamma$,
      U       $a\gamma$    $(1-a)\gamma$   $1-\gamma$     due to Rumelhart (1967)

      L           1             0              0          The formulation given in
      S           c            1-c             0          this paper
      U           a             b            1-a-b

In formulating his GFT, Rumelhart leaves the issue of the relative
size of the transition probabilities open, and introduces a third
parameter $\gamma$, which he regards as an attention parameter. If
$\gamma$ is less than 1, there is positive probability that an item in
the unlearned state will stay there following a presentation. The final
matrix, which corresponds to the one given above, is a very slight
generalization of Rumelhart's formulation. Combinations of a, b, and c
in the final formulation for which c > a+b do not correspond to any
possible combination of a, b, and $\gamma$ in the fifth matrix. The
cases of most concern in this paper satisfy the constraint c < a, so the
difference is essentially one of notational convenience. According to
Rumelhart, the introduction of $\gamma$ results in a marked improvement
in the fit of the model to his data.
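
The relation between the fifth and sixth matrices of Table 2.1 can be checked with a small sketch. It assumes the fifth (Rumelhart) matrix has U-row entries a*gamma, (1-a)*gamma, 1-gamma and S-row entry b*gamma for the transition to L; the function name from_rumelhart and the argument names a5 and b5 (the a and b of the fifth matrix) are hypothetical.

```python
def from_rumelhart(a5, b5, gamma):
    """Map the parameters of the fifth matrix to (a, b, c) of the final formulation."""
    a = a5 * gamma            # U -> L
    b = (1 - a5) * gamma      # U -> S
    c = b5 * gamma            # S -> L
    return a, b, c
```

Because a + b equals gamma and c equals b5 times gamma with b5 at most 1, every parameter set obtained this way satisfies c <= a + b, which is why combinations with c > a + b have no counterpart in the fifth matrix.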

Now let us consider the implications of the GFT framework for
presentation strategies. It is well known that the strategy which
maximizes immediate gain in probability of correct response can differ
from the strategy which maximizes the global gain over the course of the
experiment as a whole. The latter type of strategy is called globally
optimal. The rest of this chapter presents findings which together
demonstrate the need to abandon the search for globally optimal
strategies in favor of a detailed description of operating
characteristics of certain selected strategies. The crux of the argument
is that the globally optimal strategy requires looking more than one
trial ahead in all cases of interest; in the context of the general
forgetting theory this fact alone makes the globally optimal strategy
very difficult to characterize in a useful way.

It is unfortunate that the strategy that looks just one trial ahead
is not globally optimal, because this strategy is mathematically simple
and intuitively reasonable. There are very clear interpretations of this
strategy in each of the special cases of the GFT framework described
above. Each of these is a plausible generalization of or alternative to
the optimal strategy corresponding to the one-element model. But
counterexamples will be provided to show that none of these are globally
optimal.

Strategies Maximizing Immediate Gain

Let $\ell_{i,n}$, $s_{i,n}$, $u_{i,n}$ be the respective probabilities
that item i is in the long-term, short-term, or unlearned state on
trial n. Let $\delta_{i,n}$ be an indicator variable which is 1 if item
i is presented on trial n, 0 if it is not. The probability that item i
is in the long-term retention state on trial n+1 is given by

$$\ell_{i,n} + \delta_{i,n}\,(c\,s_{i,n} + a\,u_{i,n}).$$

The expected gain in number of items in the long-term state on trial
n+1 is given by

$$\sum_i \ell_{i,n+1} - \sum_i \ell_{i,n} = \sum_i \delta_{i,n}\,(c\,s_{i,n} + a\,u_{i,n}).$$

Clearly, the expected gain is maximized if we present the items with
largest values of $c\,s_{i,n} + a\,u_{i,n}$.

In the special case a = c this amounts to presenting the items with
the largest values of $s_{i,n} + u_{i,n}$, or the smallest value of
$\ell_{i,n}$. Hence, in this case the strategy maximizing the immediate
gain is a generalization of the one-element model strategy.

In the special case c = 0 the expected gain is maximized by presenting
the items most likely to be in the unlearned state, which is a different
generalization of the OEM strategy. In the case a = 0 immediate gain is
maximized by presenting the items most likely to be in the short-term
state. These comments are summarized as a theorem for future reference.
Theorem 2.1. Let $\ell_i$, $s_i$, and $u_i$ be the state probabilities
for item i on a given trial, and let a and c be the transition
probabilities for moving to the state L from states U and S,
respectively. Then the expected immediate gain is maximized by
presenting the items with largest values of

$$G_i = a\,u_i + c\,s_i.$$

In the case a = c, this is equivalent to presenting the items least
likely to be in L. In the case a = 0, it means presenting the items
most likely to be in S. In the case c = 0, it means presenting the
items most likely to be in U.
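
The selection rule in Theorem 2.1 can be illustrated with a short Python
sketch. The function names, the (l, s, u) ordering of the state vector,
and the three example items are assumptions made for illustration only;
they do not come from the report.

```python
def immediate_gain_index(s, u, a, c):
    """Index G_i = a*u_i + c*s_i from Theorem 2.1."""
    return a * u + c * s


def items_maximizing_immediate_gain(states, a, c, tol=1e-12):
    """Return the item(s) with the largest index G_i.

    `states` maps an item label to a hypothetical state-probability
    vector (l, s, u); ties are all returned, in sorted order.
    """
    gains = {item: immediate_gain_index(s, u, a, c)
             for item, (_l, s, u) in states.items()}
    best = max(gains.values())
    return sorted(item for item, gain in gains.items() if best - gain <= tol)


# Illustrative state probabilities (l, s, u) for three made-up items.
states = {"A": (0.5, 0.3, 0.2), "B": (0.6, 0.0, 0.4), "C": (0.7, 0.3, 0.0)}

print(items_maximizing_immediate_gain(states, a=0.5, c=0.5))  # a = c: least likely in L
print(items_maximizing_immediate_gain(states, a=0.0, c=0.5))  # a = 0: most likely in S
print(items_maximizing_immediate_gain(states, a=0.5, c=0.0))  # c = 0: most likely in U
```

With these made-up numbers the three special cases pick different
items, which is the point of the theorem: each special case attends to
a single state probability.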

Counterexamples to Demonstrate Non-optimality of Maximizing Immediate
Gain

When Karush and Dear (1966) established the global optimality of
the OEM strategy, they did so by first deriving the strategy maximizing
the immediate gain and then showing by an induction argument that this
strategy is, in fact, globally optimal. Their approach to the
characterization of optimal strategies will not carry over into the GFT
framework, as the counterexamples to be presented will show. A
counterexample will be described for each of the special cases mentioned
in Theorem 2.1. In each of these cases, the strategy maximizing
immediate gain focuses all attention on a single state probability,
ignoring the other two. The thrust of the counterexample in each case is
to show that the other two state probabilities carry important
information.

Case 1: a = 0. It is perhaps easiest to see the necessity for looking
ahead more than one stage by examining the special case in which each
item must pass through the short-term state before it can reach the
long-term state. That is, the probability a of making a direct
transition from the unlearned state to the long-term state equals 0. In
this special case, the policy maximizing immediate gain is to present
the item most likely to be in S. If all items start out in the unlearned
state, all have probability zero of being in S. After the first item
is presented, it has positive probability of being in S and will
continue to have for the duration of the experiment. Since the other
items would still have probability zero of being in S, the policy
maximizing immediate gain would be to continue to present the first item
indefinitely. There is no immediate gain to be had from presenting a new
item once, because it will just go to the short-term state, but there
may be considerable advantage in presenting it twice. The strategy
maximizing immediate gain ignores this possibility.






Suppose A and B are two unknown items and four trials are available
for teaching both of them. Suppose the parameter values are a = 0,
b = 1, c = f = .5 for both items. The strategy maximizing immediate
gain would devote all four trials to one item. It is easy to verify
that a better strategy would be to present each item twice in succession.
This example is logically sufficient to prove that the strategy
maximizing immediate gain is not in general the globally optimal policy
in the GFT framework. It is still possible that such strategies are
globally optimal for some other special cases. Two more examples will be
given to show that this is not the case in the instances of most
interest in the present study.
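
The two-item example can be checked with a small Python sketch that
propagates state probabilities through a fixed presentation schedule.
It rests on assumptions not spelled out in this passage: the learning
transitions act only on the presented item (U to L with probability a,
U to S with probability b, S to L with probability c), an item that is
not presented drops from S to U with probability f on each intervening
trial, and L is absorbing. The function names are hypothetical.

```python
def present(state, a, b, c):
    """Apply the learning transitions to a presented item; state = (l, s, u)."""
    l, s, u = state
    return (l + c * s + a * u, (1 - c) * s + b * u, (1 - a - b) * u)


def forget(state, f):
    """Assumed forgetting step for an item that is NOT presented on a trial."""
    l, s, u = state
    return (l, (1 - f) * s, u + f * s)


def expected_long_term(schedule, items, a, b, c, f):
    """Expected number of items in state L after a fixed schedule.

    Every item is assumed to start in the unlearned state U.
    """
    states = {item: (0.0, 0.0, 1.0) for item in items}
    for shown in schedule:
        for item in items:
            if item == shown:
                states[item] = present(states[item], a, b, c)
            else:
                states[item] = forget(states[item], f)
    return sum(state[0] for state in states.values())


params = dict(a=0.0, b=1.0, c=0.5, f=0.5)
all_on_one = expected_long_term("AAAA", "AB", **params)    # immediate-gain policy
twice_each = expected_long_term("AABB", "AB", **params)    # each item twice in succession
print(all_on_one, twice_each)
```

Under these assumptions the second schedule leaves more items in the
long-term state on average, in agreement with the claim in the text.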

Case 2: a = c. It was shown earlier that when a equals c, the
strategy maximizing the immediate gain is to present the items least
likely to be in L. Suppose two items, A and B, are not in L. Suppose
A is in state U and B is in state S. Which item should be presented on
the next trial? From the point of view of immediate gain it makes no
difference, because the probability of either item making the transition
to state L is the same; that is, a = c. However, from a longer-term
point of view it does make a difference. If item B is presented, it will
be responded to correctly; if item A is presented, it will likely be
responded to incorrectly. The incorrect response would be informative,
letting the experimenter know that the item was certainly not in L before
its presentation. Thus, it would be preferable to present item A.

The preceding argument is not completely satisfactory because it is
assumed that both A and B are not in L, so presenting A is not as
informative as it seems. But the argument would apply to a case where A
and B have the same positive probability of being in L but A is more
likely to be in U than is B. This situation is likely to arise in
practice. For example, consider the case where a = b = c = f = g = .5.
The sequence of events given in Figure 2.1 consists of an initial phase
and two alternative strategies for a second phase. At the end of the
initial phase both items have probability about .79 of being in L; item
B has probability 0 of being in U, while item A has probability .10 of
being in U. The strategy maximizing immediate gain would be indifferent
with regard to which of the two alternatives to follow. Direct numerical
calculations demonstrate that it would be preferable to present A first.

Case 3: c = 0. Table 2.2 gives a sequence of events to show the
need for looking more than one trial ahead in the case c = 0,
a = b = .25, f = g = .50. (It is still a two-item list under
consideration.) When c = 0, the strategy maximizing immediate gain is to
present the items most likely to be in U. The idea behind this example
is a simple one: It can be advantageous to refrain from presenting an
item, even if it is the one most likely to be in U, if waiting will
significantly increase the probability of being in U. It is necessary
that there be another item available to present whose prospect for
immediate gain is nearly as good.

This example also consists of two phases, the first phase showing
how two items could come to have certain critical state probabilities
under the policy of maximizing immediate gain, the second phase showing
the advantage of looking two stages ahead instead of one, given these
state probabilities* At the end of the initial phase the state proba- 
bility vectors for items A and B are ^, ^) and {'^, -j^, ), 

[Figure 2.1. Sequence of events for the case a = b = c = f = g = .5: an
initial phase followed by two alternative second phases (presenting A
first or presenting B first), each branching on correct and incorrect
responses.]


Table 2.2 

Sequences of Events Showing Policy of Maximizing Immediate
Gain is Suboptimal When a = b = .25, c = 0, f = g = .5.



Initial phase under 
policy maximizing 

immediate gain: A+/b+/A+/B-/&i-/AV continuation 

Continuation under
MIG policy: A ±/B

Better continuation: B ±/A

Note: Letter indicates item presented; the sign following
the letter indicates the correctness of the response,
where ± means that the correctness does not influence
the decision regarding which item to present next.






respectively. Since A is the item more likely to be in U, the policy
maximizing immediate gain would be to present A next. If A is presented,
the state probabilities on the next trial will be such that item B will
be presented on the final trial, whether the response to A is correct or
not. Similarly, if item B is presented first, item A should be presented
on the final trial. Direct calculations show that the latter policy is
slightly preferable.

These examples show that globally optimal strategies in the GFT
framework generally require more than maximization of immediate gain.
They do not show that the strategy maximizing immediate gain is never
optimal. It obviously is in the special case where the model reduces
to the OEM, when b = 0. Even if b is positive, if it is sufficiently
small, it will have no bearing on the optimal strategy. The implications
of the counterexamples given above concern what can be said in general
about globally optimal strategies without specifying the exact values
of the parameters. The fact that we can say very little suggests that
a descriptive approach permitting comparison of strategies with one
another, but not with the globally optimal strategy, would be appropriate.






CHAPTER III
OPERATING CHARACTERISTICS OF THREE 
PRESENTATION STRATEGIES 

A reasonable model of learning should enable one to make a variety
of predictions about the overall state of a list of items, provided the
items are presented in a certain way. The presentation procedure used
most often in the evaluation of learning models is the RC procedure
discussed earlier. When this procedure is used, sample statistics
corresponding to expressions for the trial of last error, the
probability of error given the last response to the item was an error,
and other descriptive statistics of interest can be calculated and
compared with theoretical predictions.

Matters become more complicated when presentation procedures other
than the RC procedure are employed. For one thing, the meaning of the
descriptive statistics which are of interest may change when other
procedures, such as the OEM strategy, are used. For example, under the
OEM strategy the number of presentations varies widely from one item to
another, so the "trial of last error" means something different than it
does under the RC strategy. Another difficulty which arises concerns
the derivation of theoretical expressions for statistics of interest.
Indeed, it is only in exceptional cases that it is possible to derive
explicit expressions for quantities of interest. Usually, the number
of times each item is presented, and when, is subject to such a variety
of contingencies that explicit calculations are not feasible. As an
illustration, consider the case of the strategy maximizing immediate
gain within the GFT framework. The item to be presented on a given trial
is the item having the highest value on an index which is a function of
the parameter values, the number of correct responses since the last
error, and the number of items intervening between each of these correct
responses. Direct calculation of exact theoretical formulas of interest
in this situation appears to be hopeless.

The situation is not as bleak as this in the case of the OEM strategy
because there is a pattern to presentations under this procedure
which serves as a natural basis for summarizing the overall state of the
items. This pattern will be described in some detail, because it serves
as a basis for most of the theoretical derivations in this chapter.

Presentation cycles and "almost" sufficient histories. Under the
OEM procedure items are presented in a series of cycles which are similar
in many respects to trials under the RC procedure. Each item receives a
specified treatment on each cycle. The difference between the RC and the
OEM procedures lies in the fact that under the OEM procedure the
treatment of an item may involve several presentations, whereas under
the RC procedure treatment consists of a single presentation per item
per trial.

The cyclic structure of item presentations under the OEM strategy
arises in the following manner. The strategy says to present an item
whenever the string of consecutive correct responses to it is shorter
than the corresponding strings for the other items. If several items
are tied at a given point, the choice is made on a random basis. At the
beginning of a cycle this index (the length of the current string of
consecutive correct responses) is the same for all items. If an item
is presented and receives a correct response, its index is incremented
by 1 and is therefore greater than the indices for the other items. It
will not be eligible for presentation again until all the other items
reach the same level, i.e., until the cycle has been completed for all
the other items. If, on the other hand, the item is responded to
incorrectly, its index is reset to 0, so it is lower than all other
items and will continue to be lower until repeated presentations bring
it back to their level.
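
The selection and updating rule just described can be sketched in
Python as follows. The function names and the dictionary of run lengths
are illustrative assumptions; only the rule itself (present an item
with the shortest current run of correct responses, break ties at
random, reset the run to 0 after an error) is taken from the text.

```python
import random


def choose_item(run_lengths, rng=random):
    """Pick an item whose current run of correct responses is shortest.

    `run_lengths` maps each item to the length of its current string of
    consecutive correct responses; ties are broken at random.
    """
    shortest = min(run_lengths.values())
    tied = [item for item, run in run_lengths.items() if run == shortest]
    return rng.choice(tied)


def update_run(run_lengths, item, correct):
    """Increment the run after a correct response, reset it after an error."""
    run_lengths[item] = run_lengths[item] + 1 if correct else 0


runs = {"A": 0, "B": 0, "C": 0}
update_run(runs, "A", correct=True)   # A moves ahead and must wait
update_run(runs, "C", correct=True)   # C moves ahead as well
print(choose_item(runs))              # only B is still at the old level
```

An item that has just been answered correctly is therefore passed over
until the rest of the list catches up, which is what produces the cycle
structure.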

Denote by cycle n those presentations required to move the list 
from the place where all items have been responded to correctly n-1 
times in a row to the place where they have all been responded to cor- 
rectly n times in a row. Most of the operating characteristics of 
interest in describing performance under the OEM procedure, such as 
cycle of last error, probability of error, and cumulative number of 
presentations, are functions of the cycle number. 

When the OEM is an accurate description of the learning process,
the cycle number is a sufficient history for describing the state of a
list of items because for every item in the list the cycle number is
equal to the number of correct responses since the last error. The lag
between successive presentations of a given item is irrelevant. If
forgetting is taken into account, as it is in the GFT framework being
considered here, then the lags become an important factor. Strictly
speaking, given a GFT model a sufficient history for each item involves
the number of correct responses since the last error and the number of
intervening items presented between each of these correct responses.
Fortunately, it is possible to simplify this sufficient history in the
case of the OEM procedure with negligible loss of information.



The following observations are the justification for the simplification
of the sufficient history of an item which will be referred to as
an almost sufficient history:

   1. When an item is presented on a cycle and the response is correct
      on the first try, the item is not presented again on that
      cycle, so it is safe to assume that many intervening items will
      be presented before that item is presented again.
   2. When an item is presented on a given cycle and the response is
      an error, there follows a string of presentations of the item
      without any intervening items, culminating in a string of
      presentations with correct responses whose length is 1 less than
      the cycle number. The last correct response is made after a
      number of intervening items have been presented.

As a consequence of these features of the OEM procedure, the state
of an item at a given point in the instructional process is essentially
determined by its cycle number and the cycle of last error. The string
of correct responses on the cycle of last error has lag 0 between each
presentation except for the last presentation. It and the correct
responses on subsequent cycles have what may, for practical purposes, be
regarded as infinite lag. Therefore, s = 0 for these presentations,
where s is the probability of the item being in the short-term state.
Thus, the cycle number indicates the number of consecutive responses to
an item and the cycle of last error indicates the lengths of an initial
block of presentations with no intervening items and a final block of
presentations with many items intervening between them.






By conditioning on the cycle number and the cycle of last error,
it is possible to calculate approximate theoretical expressions for a
number of statistics of interest when this presentation procedure is
employed. These calculations will be carried out in the next section.
Subsequent sections will discuss corresponding expressions for other
procedures. The other procedures to be treated include the RC procedure
and the hypothetical procedure that serves as a baseline for comparisons
to be made in the next chapter. This hypothetical strategy will be
referred to as the modified OEM procedure. This is the procedure that
would result if one were somehow able to introduce very long lags
between the several presentations of an item which has been responded
to incorrectly on a given cycle. This hypothetical procedure is better
than any procedure that can really be carried out, so it serves as a
useful bound in determining how close a suboptimal procedure is to being
optimal. It serves this purpose in the place of the optimal strategy,
whose operating characteristics cannot be determined in practice because
the strategy itself is unknown.

Operating Characteristics of the OEM Strategy

The basic statistic for describing performance under the OEM strategy
is the expected number of presentations per item required for each
cycle. Every item receives exactly one presentation on the first cycle,
but on subsequent cycles the number of presentations is a random variable.
Define three random variables for each cycle $k \ge 2$ as follows. Let

   $N_k$ = number of presentations required for an item to complete
           cycle k,
   $W_k$ = number of presentations following an error required to
           obtain a sequence of k consecutive correct responses, and
   $\pi_{k,k}$ = the probability of at least one error on cycle k for a
           given item (the reason for the double subscript will become
           clear later).

Now

$$N_k = \begin{cases} 1, & \text{with probability } 1 - \pi_{k,k}, \\ 1 + W_k, & \text{with probability } \pi_{k,k}. \end{cases}$$

Therefore, we have

$$E N_k = 1 + \pi_{k,k}\, E W_k.$$

The key task of this section is to find $\pi_{k,k}$ and $E W_k$. The
main ideas to be used in accomplishing this task apply to a broader
class of models than the GFT framework, so they will be set forth in
some generality. Then specific approximations will be obtained for the
GFT framework.
General Formulas

The distribution of $W_k$. The crucial fact to note about $W_k$ is that
it is the waiting time in a terminating renewal process, in the sense
that Feller (1969, p. 186) defines the term. A renewal process is a
stochastic process whose characteristic feature is that there is an
event which sets the process back to its starting point whenever it
occurs. Such an event is called a recurrent event. We may regard $W_k$
as the waiting time for the first occurrence of a sequence of k
consecutive correct responses following an error. The occurrence of an
error is the recurrent event which resets the probabilistic structure
of the process. Renewal theory provides a fundamental relationship
between the distribution of $W_k$ and the distribution of $E_k$, the
waiting time for the next error. Actually, the distribution of $E_k$ is
defective because there is positive probability that the cycle will
terminate before there is another error on the cycle (hence the term
terminating renewal process). For this reason, we also consider the
conditional distribution of $E_k$ given that there is another error on
the cycle.



Let

$$P(W_k = n) = w_{k,n},$$

and let

$$P(E_k = v) = e_{k,v} \quad \text{for } v = 1, \ldots, k,$$

and

$$P(\text{process terminates without an error}) = e_{k,k+1}.$$

The conditional distribution of $E_k$, given that another error occurs
on the cycle, is given by

$$P(E_k = v \mid E_k = v' \text{ for some } v' = 1, \ldots, k) = \frac{e_{k,v}}{1 - e_{k,k+1}}.$$

Let the conditional random variable be denoted by $E^*_k$. The
generating function for the distribution of $E^*_k$ is then

$$g_{E^*_k}(s) = \frac{1}{1 - e_{k,k+1}} \sum_{v=1}^{k} e_{k,v}\, s^v.$$

Let $g_{W_k}(s) = \sum_{n=0}^{\infty} w_{k,n}\, s^n$ be the generating
function of $W_k$.






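Theorem 3.1. Consider any model for which the occurrence of an error on
cycle k is a terminating recurrent event. Then conclusions A, B, and C
below follow.

A. The distribution of $W_k$ is given by

$$w_{k,n} = \begin{cases} 0, & \text{for } n < k, \\ e_{k,k+1}, & \text{for } n = k, \\ \sum_{v=1}^{k} e_{k,v}\, w_{k,n-v}, & \text{for } n > k. \end{cases} \tag{2}$$

B. The generating function of $W_k$ is given by

$$g_{W_k}(s) = \frac{e_{k,k+1}\, s^k}{1 - (1 - e_{k,k+1})\, g_{E^*_k}(s)}. \tag{3}$$

C. The expected value of $W_k$ is given by

$$E W_k = k + \frac{1 - e_{k,k+1}}{e_{k,k+1}}\, g'_{E^*_k}(1). \tag{4}$$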



Before proceeding with the proof of the theorem, it would be good
to interpret the terms on the right-hand side of Equation 4. Equation
4 states that the average number of presentations required following an
error on cycle k is the sum of the number of consecutive correct responses
required, k, and the number of extra responses made necessary by further
errors. The latter term is the product of $1 - e_{k,k+1}$, the
probability of further errors on the cycle; $1/e_{k,k+1}$, the expected
number of errors on the cycle; and $g'_{E^*_k}(1)$, the average number
of presentations per error. These quantities depend, of course, on the
exact nature of the particular learning model being considered.

Proof of the theorem. It is obvious that $w_{k,n} = 0$ for n < k. The
waiting time for k consecutive correct responses is equal to k with
probability $e_{k,k+1}$. If $W_k > k$, there must have been an error on
some presentation v = 1, ..., k. The probability that $W_k = n$, given
an error on presentation v, is $w_{k,n-v}$. The expression for $w_{k,n}$
when n > k in Equation 2 is the weighted average of the $w_{k,n-v}$'s.
Taken together, these comments justify Equation 2.

If both sides of Equation 2 are multiplied by $s^n$ and the results
for all values of n added together, the result is

$$\begin{aligned}
g_{W_k}(s) &= e_{k,k+1}\, s^k + \sum_{n=k+1}^{\infty} \sum_{v=1}^{k} e_{k,v}\, w_{k,n-v}\, s^n \\
           &= e_{k,k+1}\, s^k + \sum_{v=1}^{k} e_{k,v}\, s^v \sum_{n=k+1}^{\infty} w_{k,n-v}\, s^{n-v} \\
           &= e_{k,k+1}\, s^k + (1 - e_{k,k+1})\, g_{E^*_k}(s)\, g_{W_k}(s),
\end{aligned}$$

so that

$$g_{W_k}(s) = \frac{e_{k,k+1}\, s^k}{1 - (1 - e_{k,k+1})\, g_{E^*_k}(s)},$$

which is Equation 3.

Differentiating Equation 3 yields

$$g'_{W_k}(s) = \frac{k\, e_{k,k+1}\, s^{k-1} \left[1 - (1 - e_{k,k+1})\, g_{E^*_k}(s)\right] + e_{k,k+1}\, s^k\, (1 - e_{k,k+1})\, g'_{E^*_k}(s)}{\left[1 - (1 - e_{k,k+1})\, g_{E^*_k}(s)\right]^2}.$$

Letting s = 1 and noting that $g_{E^*_k}(1) = 1$, we obtain

$$g'_{W_k}(1) = k + \frac{1 - e_{k,k+1}}{e_{k,k+1}}\, g'_{E^*_k}(1),$$

which justifies Equation 4 and completes the proof.
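
Equations 2 and 4 can be compared numerically with a short Python
sketch. The error-time distribution used here is an assumption chosen
only to make the check concrete (each presentation is correct
independently with probability p); it is not the GFT. The function
names are likewise hypothetical.

```python
def waiting_time_distribution(e, n_max):
    """Distribution w[n] = P(W_k = n) from the recursion in Equation 2.

    `e` is the list [e_{k,1}, ..., e_{k,k}, e_{k,k+1}]; the last entry is
    the probability that the run of k correct responses ends the cycle.
    """
    k = len(e) - 1
    w = [0.0] * (n_max + 1)          # w[n] = 0 for n < k
    w[k] = e[k]                      # no further error: W_k = k
    for n in range(k + 1, n_max + 1):
        w[n] = sum(e[v - 1] * w[n - v] for v in range(1, k + 1))
    return w


def expected_waiting_time(e):
    """Closed form of Equation 4: k + (1 - e_{k,k+1}) / e_{k,k+1} * E(E*_k)."""
    k = len(e) - 1
    no_error = e[k]
    mean_e_star = sum(v * e[v - 1] for v in range(1, k + 1)) / (1.0 - no_error)
    return k + (1.0 - no_error) / no_error * mean_e_star


# Illustrative assumption: independent responses, correct with probability p.
p, k = 0.6, 3
e = [p ** (v - 1) * (1 - p) for v in range(1, k + 1)] + [p ** k]

w = waiting_time_distribution(e, n_max=2000)
mean_from_recursion = sum(n * prob for n, prob in enumerate(w))
print(mean_from_recursion, expected_waiting_time(e))
```

The mean computed from the recursion and the closed form agree to
within the truncation error of the finite sum, as Theorem 3.1 requires.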

Recursion formulas for computing $\pi_{k,k}$. The general formulas for
computing $\pi_{k,k}$ to be given are valid for any model of learning
for which the initial probability of a correct response is a guessing
probability. However, they are really useful only if the model is one
for which conditioning on the cycle of last error leads to a
simplification or a reasonable approximation, as is the case with the
GFT.

Let $q_{k,n}$ be defined as follows:

$$q_{k,n} = \begin{cases}
P(\text{error on cycle } n \mid \text{cycle } n-1 \text{ just completed, no errors on item yet}), & \text{for } k = 0, \\
P(\text{error on cycle } n \mid \text{cycle } n-1 \text{ just completed, last error on cycle } k), & \text{for } k = 1, \ldots, n-1.
\end{cases}$$

Similarly, let $\pi_{k,n}$ be defined as

$$\pi_{k,n} = \begin{cases}
P(\text{no errors on item} \mid \text{cycle } n \text{ just completed}), & \text{for } k = 0, \\
P(\text{last error was on cycle } k \mid \text{cycle } n \text{ just completed}), & \text{for } k = 1, \ldots, n.
\end{cases}$$

Note that according to this definition $\pi_{k,k}$ is the probability
that the last error was on cycle k, given that cycle k has just been
completed. But this is just the probability that there was an error on
cycle k, which accords with the definition of $\pi_{k,k}$ given earlier.
Also, note that $q_{k,n}$ is not defined for n = 1. It is not needed and
the definition makes no sense in that instance.






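Theorem 3.2. $\pi_{k,k}$ can be computed in terms of $q_{j,v}$'s with
$j, v \le k$ using the following relationships.

$$\pi_{0,1} = g, \qquad \pi_{1,1} = 1 - g \tag{6a}$$

$$\pi_{0,n} = g \prod_{k=1}^{n-1} (1 - q_{0,k+1}) \tag{6b}$$

$$\pi_{j,n} = \pi_{j,j} \prod_{k=j+1}^{n} (1 - q_{j,k}), \quad n > j \tag{6c}$$

$$\pi_{k,k} = \sum_{j=0}^{k-1} \pi_{j,k-1}\, q_{j,k} \tag{6d}$$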



The justification for Equation 6a is that $\pi_{0,1}$ and $\pi_{1,1}$
are the respective proportions of items having and not having correct
responses on the first presentation. Since items are assumed to be
unknown at the outset, the values are g and 1-g. The relationship
expressed in Equation 6b says simply that the probability of no errors
on an item through cycle n is the product of the probability of guessing
correctly on the first cycle and the appropriate conditional
probabilities of not making an error on succeeding cycles.

On completion of cycle n, where n > j, the proportion of items
whose last error was on cycle j is the proportion of items with an error
on cycle j and no further errors through the nth cycle. This is the
relationship expressed in Equation 6c.

The formula given in Equation 6d expresses $\pi_{k,k}$ as the sum of
the conditional probabilities of error on cycle k, given the cycle of
last error, each weighted by the probability that it was the last error.
This completes the proof.
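
The relationships described for Equations 6a to 6d can be sketched as a
recursion in Python. The sketch updates the probabilities one cycle at
a time, which is equivalent to writing the products out in full; the
function names and the placeholder conditional error probabilities
`example_q` are hypothetical and stand in for whatever a particular
learning model supplies.

```python
def cycle_probabilities(q, g, n_max):
    """Compute pi[(k, n)] for 0 <= k <= n <= n_max.

    `q(k, n)` is the conditional probability of an error on cycle n given
    that the last error was on cycle k (k = 0 meaning no errors yet), and
    `g` is the guessing probability.
    """
    pi = {(0, 1): g, (1, 1): 1.0 - g}                 # first cycle
    for n in range(2, n_max + 1):
        # Error on cycle n: weight each conditional error probability by
        # the probability of the corresponding cycle of last error.
        pi[(n, n)] = sum(pi[(j, n - 1)] * q(j, n) for j in range(n))
        # No error on cycle n: the cycle of last error is unchanged.
        for j in range(n):
            pi[(j, n)] = pi[(j, n - 1)] * (1.0 - q(j, n))
    return pi


def example_q(k, n):
    """Placeholder error probabilities, chosen only for illustration."""
    return 0.5 ** (n - k)


pi = cycle_probabilities(example_q, g=0.25, n_max=6)
print([round(pi[(n, n)], 4) for n in range(1, 7)])
```

A useful sanity check on any implementation is that, for each cycle n,
the probabilities pi[(0, n)], ..., pi[(n, n)] sum to 1, since they
partition the items by cycle of last error.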




In order to use Theorems 3.1 and 3.2 to find the expected number of
presentations on cycle k for a specific model of the learning process,
it is usually necessary to have explicit expressions in terms of model
parameters for the following quantities:

   A. The $q_{k,n}$, the conditional probabilities of error on cycle n,
      given the last error was on cycle k.
   B. The probability of no further errors on the kth cycle following
      an error on that cycle, $e_{k,k+1}$.
   C. The conditional distribution of the waiting time for the next
      error following an error on cycle k, given that there will be
      another error on the cycle, and its mean.

Before deriving expressions for these quantities in the general
GFT case, it should be noted that the calculations can be simplified
considerably in the important special case where Greeno's model applies.
The calculation of the $q_{k,n}$'s can be avoided because a formula
giving the approximation for $\pi_{k,k}$ can be derived directly. The
other quantities of interest are simple functions of $\pi_{k,k}$ and the
cycle number.

Theorem 3.2a. When Greeno's model applies, $\pi_{k,k}$ can be
approximated by the following formula.

$$\pi_{k,k} \doteq \begin{cases}
1 - g, & \text{when } k = 1, \\
(1-g)(1-a) \left[\dfrac{g(1-a)}{a + g(1-a)}\right]^{k-2}, & \text{for } k \ge 2.
\end{cases}$$

Proof. The formula is obvious for k = 1 and 2. It needs to be
demonstrated that
$\pi_{k+1,k+1} = \frac{g(1-a)}{a + g(1-a)}\, \pi_{k,k}$ for $k \ge 2$.
It suffices to show that

$$P(\text{in state U on cycle } k+1 \mid \text{in state U on cycle } k) = \frac{g(1-a)}{a + g(1-a)}.$$

An item must be responded to correctly in order to complete a cycle. If
an item is in U at the start of a cycle, one of three things will happen:
a correct response by guessing and no transition to the long-term state
(with probability g(1-a)); a response, correct or incorrect, followed by
transition to the long-term state (with probability a); or an incorrect
response and no transition to the long-term state (with probability
(1-g)(1-a)). In the latter case, there follows a make-up sequence of
presentations, which are useless if Greeno's model holds because the
item is trapped in the short-term state. Then intervening items are
presented, and finally the item is tried once again. By this time the
item is back in U (according to the approximation assumption); the
process starts over and is repeated until one of the first two
situations obtains. Thus

$$P(\text{in state U on cycle } k+1 \mid \text{in state U on cycle } k) = g(1-a) \sum_{v=0}^{\infty} \left[(1-g)(1-a)\right]^v = \frac{g(1-a)}{1 - (1-g)(1-a)} = \frac{g(1-a)}{a + g(1-a)},$$

as required.
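
The approximation in Theorem 3.2a and the geometric series used in its
proof can be checked with a few lines of Python. The function names and
the parameter values a = .3, g = .25 are illustrative assumptions only.

```python
def pi_kk_greeno(k, a, g):
    """Approximate probability of an error on cycle k under Greeno's model."""
    if k < 1:
        raise ValueError("cycle number k must be at least 1")
    if k == 1:
        return 1.0 - g
    ratio = g * (1.0 - a) / (a + g * (1.0 - a))
    return (1.0 - g) * (1.0 - a) * ratio ** (k - 2)


def stay_unlearned_probability(a, g, terms=500):
    """Partial sum of g(1-a) * sum_v [(1-g)(1-a)]**v from the proof."""
    return g * (1.0 - a) * sum(((1.0 - g) * (1.0 - a)) ** v for v in range(terms))


a, g = 0.3, 0.25
print([round(pi_kk_greeno(k, a, g), 4) for k in range(1, 6)])
print(stay_unlearned_probability(a, g), g * (1 - a) / (a + g * (1 - a)))
```

The ratio of successive values of the approximation for k of 2 or more
matches the summed series, which is the step the proof turns on.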

Approximations Under the GFT

Theorem 3.3. A formula for $q_{k,n}$. In terms of the parameters of the
GFT presented in Chapter 2, the conditional probability of an error on
cycle n, given that the last error was on cycle k, is approximately
given by the following formula.

$$q_{k,n} \doteq \begin{cases}
\dfrac{(1-g)(1-a)\,[g(1-a)]^{n-2}}{[g(1-a)]^{n-2} + a \sum_{v=0}^{n-3} [g(1-a)]^{v}}, & \text{for } k = 0,\ n \ge 2, \\[2ex]
\dfrac{\bar{L}_k\,[g(1-a)]^{n-k-1}\,(1-g)}{1 - \bar{L}_k\,(1-g) \sum_{v=0}^{n-k-2} [g(1-a)]^{v}}, & \text{for } k \ge 1,\ n > k,
\end{cases} \tag{7}$$

where $L_k$ = P(in state L | make-up sequence of length k just
completed) and $\bar{L}_k = 1 - L_k$.

Proof. The expression for $q_{0,n}$ will be developed first.

$$\begin{aligned}
q_{0,n} &= \frac{P(\text{no errors through cycle } n-1,\ \text{error on cycle } n)}{P(\text{no errors through cycle } n-1)} \\
        &\doteq \frac{(1-g)\,[g(1-a)]^{n-1}}{g\,[g(1-a)]^{n-2} + \sum_{v=1}^{n-2} a\,g\,[g(1-a)]^{v-1}} \\
        &= \frac{(1-g)(1-a)\,[g(1-a)]^{n-2}}{[g(1-a)]^{n-2} + a \sum_{v=0}^{n-3} [g(1-a)]^{v}},
\end{aligned}$$

as asserted.

In order to develop the expression for $q_{k,n}$ when $k \ge 1$, let the
events A and B be defined as follows.

   A = {error on cycle n}
   B = {last error through cycle n-1 was on cycle k}.

Then

$$q_{k,n} = P(A \mid B) = \frac{P(A \cap B)}{P(B)}.$$

The event B will occur if there is an error on cycle k, unless the item
is still not in state L at the end of the make-up sequence in cycle k
and there is another error before cycle n. Hence

$$P(B) \doteq \pi_{k,k} \left(1 - \bar{L}_k\,(1-g) \sum_{v=0}^{n-k-2} [g(1-a)]^{v}\right).$$

The event $A \cap B$ will occur if there is an error on cycle k and a
series of correct guesses and failures to go to state L on subsequent
cycles, ended by an incorrect guess on cycle n. The probability of this
event is given by

$$P(A \cap B) \doteq \pi_{k,k}\,\bar{L}_k\,[g(1-a)]^{n-k-1}\,(1-g).$$

Dividing this expression by the expression just derived for P(B) yields
the formula for $q_{k,n}$ given in Equation 7.
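
The two ratios derived in this proof can be written as a small Python
function. This is a sketch under stated assumptions: the function name
and argument names are hypothetical, and the value of
$\bar{L}_k$ (passed as `l_bar_k`) is treated as given, since it is
derived separately in the report.

```python
def q_kn(k, n, a, g, l_bar_k=None):
    """Approximate P(error on cycle n | last error on cycle k).

    k = 0 means no errors so far. For k >= 1, `l_bar_k` is the
    probability that the item is NOT in state L at the end of the make-up
    sequence on cycle k; it must be supplied by the caller.
    """
    r = g * (1.0 - a)                      # correct guess, no move to L
    if k == 0:
        if n < 2:
            raise ValueError("q is defined only for n >= 2 when k = 0")
        denominator = r ** (n - 2) + a * sum(r ** v for v in range(n - 2))
        return (1.0 - g) * (1.0 - a) * r ** (n - 2) / denominator
    if n <= k:
        raise ValueError("need n > k when k >= 1")
    if l_bar_k is None:
        raise ValueError("l_bar_k is required when k >= 1")
    denominator = 1.0 - l_bar_k * (1.0 - g) * sum(r ** v for v in range(n - k - 1))
    return l_bar_k * r ** (n - k - 1) * (1.0 - g) / denominator


a, g = 0.3, 0.25
print([round(q_kn(0, n, a, g), 4) for n in range(2, 7)])
print([round(q_kn(2, n, a, g, l_bar_k=0.4), 4) for n in range(3, 8)])
```

For n = 2 and k = 0 the expression reduces to (1-g)(1-a), and for
n = k+1 it reduces to $\bar{L}_k(1-g)$, which gives two quick checks on
the reconstruction.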

As one might expect, the quantities remaining to be calculated are
closely interrelated. It is necessary to know $L_k$ in order to use
Equation 7 to compute $q_{k,n}$. It will soon be seen that it is
necessary to know $e_{k,k+1}$ in order to find $L_k$. Calculating
$e_{k,k+1}$ involves the determination of the distribution of the
waiting time for the next error. It will help keep repetition to a
minimum if we can refer to some basic quantities involved in the several
separate calculations. The breakdown of $W_k$ given in Figure 3.1
suggests what these quantities might be. Define five new random
variables as follows.

   $K_U$ = the number of presentations in criterion run on cycle k in
           state U.
   $K_S$, $K_L$ = the corresponding random variables for states S and L.
   $E^*_{k,U}$ = the number of presentations in state U following an
           error, given that there will be another error on cycle k.
   $E^*_{k,S}$ = the corresponding random variable for state S.

It turns out that the remaining calculations in this section will be
expedited by considering the joint distributions of $(K_U, K_S, K_L)$ and
/ . U O XI 

[Figure 3.1: breakdown of the presentations on cycle k into those made
in states U, S, and L.]




$(E'_{k,U}, E'_{k,S})$, respectively. Note that $K_U + K_S + K_L = k$ and $E'_{k,U} + E'_{k,S} = E'_k$.

The joint distribution of $(E'_{k,U}, E'_{k,S})$ is a conditional distribution given that another error is going to occur. For this reason the probability of the simple sequence of events that results in the event $\{E'_{k,U} = \ell, E'_{k,S} = m\}$ for some $\ell$ and $m$ must be divided by $1 - e_{k,k+1}$ in order to find $P(E'_{k,U} = \ell, E'_{k,S} = m)$. Figure 3.2 shows a classification of points $(\ell, m)$ having positive probability into three types such that the probability expressions are similar for points within a type. The figure applies to the special case k = 4. The reader can easily verify the following assertion if he bears in mind that presentations in a make-up sequence on cycle k are contiguous until there are k-1 presentations receiving correct responses, in which case a number of items intervene before the next presentation. Let $\gamma = g(1-a-b)$. Then

\[
(1-e_{k,k+1})\,P(E'_{k,U}=\ell,\;E'_{k,S}=m) =
\begin{cases}
\gamma^{\ell-1}(1-a-b)(1-g), & \text{for } \ell=1,\ldots,k-1;\ m=0;\\
\gamma^{k-1}(1-a)(1-g), & \text{for } \ell=k;\ m=0;\\
\gamma^{\ell-1}\,b\,(1-c)^{m}(1-g), & \text{for } \ell=1,\ldots,k-1;\ m=k-\ell.
\end{cases}
\tag{8}
\]

Figure 3.3 gives a classification of points $(\ell, m, n)$ having similar expressions for $e_{k,k+1}\,P(K_U = \ell, K_S = m, K_L = n)$. A suitable approximation for this quantity is given by



[Figure 3.2. Classification of points $(\ell, m)$ having similar expressions for $(1-e_{k,k+1})\,P(E'_{k,U} = \ell, E'_{k,S} = m)$, where $E'_{k,U}, E'_{k,S}$ are the numbers of presentations in the respective states in a run culminating in an error on cycle k. Drawn for k = 4. Annotations in the figure:
  Case I: No passage to S, finally guess correctly.
  Case II: Whether or not passage to S takes place after the (k-1)st correct guess does not have any bearing, since many other items will intervene before the next presentation.
  Passage to S means that the error will occur on the last presentation, if at all.]

[Figure 3.3. Classification of points $(\ell, m, n)$ having similar expressions for $e_{k,k+1}\,P(K_U = \ell, K_S = m, K_L = n)$, where $K_U, K_S, K_L$ are the numbers of presentations in the respective states during the criterion run on cycle k. Axis labels: $\ell$ = 0, 1, 2, 3, 4; n = 4, 3, 2, 1, 0.]



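\[
e_{k,k+1}\,P(K_U=\ell,\;K_S=m,\;K_L=n) =
\begin{cases}
\gamma^{k-1}g(1-a), & \text{for } \ell=k;\ m=n=0;\\
a\gamma^{\ell}, & \text{for } \ell=0,\ldots,k-1;\ m=0;\ n=k-\ell;\\
g\gamma^{\ell-1}\,b\,(1-c)^{k-\ell}, & \text{for } \ell=1,\ldots,k-1;\ m=k-\ell;\ n=0;\\
\gamma^{k-m-n}\,b\,(1-c)^{m-1}c, & \text{for } \ell=k-m-n;\ m=1,\ldots,k-n;\ n=1,\ldots,k-1.
\end{cases}
\tag{9}
\]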



The value of $e_{k,k+1}$ can be calculated either by adding the results of Equation 9 for all allowable values of $(\ell, m, n)$ or by adding the results of Equation 8 over all possible values of $(\ell, m)$ and subtracting the outcome from 1. The latter course is simpler because it avoids a messy double summation.

\[
e_{k,k+1} = 1-(1-g)(1-a)\gamma^{k-1} - \sum_{\ell=1}^{k-1}\gamma^{\ell-1}(1-a-b)(1-g) - \sum_{\ell=1}^{k-1} b(1-g)\gamma^{\ell-1}(1-c)^{k-\ell}
\]
\[
= 1-(1-g)\left\{(1-a)\gamma^{k-1} + (1-a-b)\frac{1-\gamma^{k-1}}{1-\gamma} + \frac{b}{1-\frac{\gamma}{1-c}}\left[(1-c)^{k-1}-\gamma^{k-1}\right]\right\}.
\tag{10}
\]

This expression can be simplified significantly in some important special cases.

\[
e_{k,k+1} =
\begin{cases}
1 - b(1-g)(1-c)^{k-1}, & \text{when } a+b = 1,\ c > 0;\\[4pt]
1 - \dfrac{(1-g)(1-a)(1-\gamma^{k})}{1-\gamma}, & \text{when } a+b < 1,\ c = 0;\\[4pt]
1 - b(1-g), & \text{when } a+b = 1,\ c = 0.
\end{cases}
\tag{11}
\]
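The two routes to $e_{k,k+1}$ described above can be checked against each other numerically. The following is a minimal sketch, not part of the report: the function names are hypothetical, and the term-by-term expressions follow my reading of Equations 8, 9 and 10, so any misreading of the scanned formulas would carry over. The closed form assumes $\gamma \neq 1-c$.

```python
def error_free_prob_closed(k, a, b, c, g):
    """Closed form of Equation 10 (assumes gamma != 1 - c)."""
    gam = g * (1 - a - b)
    ratio = gam / (1 - c)
    in_u = (1 - a - b) * (1 - gam ** (k - 1)) / (1 - gam)
    via_s = b * ((1 - c) ** (k - 1) - gam ** (k - 1)) / (1 - ratio)
    return 1 - (1 - g) * ((1 - a) * gam ** (k - 1) + in_u + via_s)


def error_free_prob_from_eq8(k, a, b, c, g):
    """One minus the total mass of Equation 8 over all points (l, m)."""
    gam = g * (1 - a - b)
    mass = gam ** (k - 1) * (1 - a) * (1 - g)                      # l = k, m = 0
    for l in range(1, k):
        mass += gam ** (l - 1) * (1 - a - b) * (1 - g)             # m = 0
        mass += gam ** (l - 1) * b * (1 - c) ** (k - l) * (1 - g)  # m = k - l
    return 1 - mass


def error_free_prob_from_eq9(k, a, b, c, g):
    """Total mass of Equation 9 over all points (l, m, n)."""
    gam = g * (1 - a - b)
    mass = gam ** (k - 1) * g * (1 - a)                            # l = k
    for l in range(0, k):
        mass += a * gam ** l                                       # n = k - l
    for l in range(1, k):
        mass += g * gam ** (l - 1) * b * (1 - c) ** (k - l)        # m = k - l
    for n in range(1, k):                                          # double sum
        for m in range(1, k - n + 1):
            mass += gam ** (k - m - n) * b * (1 - c) ** (m - 1) * c
    return mass
```

With illustrative values such as k = 3, a = .2, b = .5, c = .1, g = .25, all three functions should agree, which is a useful guard against transcription slips in the exponents.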



A formula for $L'_k$. It was noted above that the formula for $q_k$ given in Equation 7 presumes knowledge of $L'_k$. For k = 1, $EP_1 = 1$ and $L'_1 = 1 - a$ because items are just presented once on the first cycle, regardless of the correctness of the responses. In order to derive an estimate of $L'_k$ for $k \ge 2$, define events A, B, and C as follows.

A = {Error on cycle k}
B = {Error on cycle k, cycle k completed with no further errors}
C = {Error on cycle k, cycle k completed with no further errors, but item fails to make transition to state L}.

The fact that $C \subset B \subset A$ and P(A) > 0 implies that

\[ L'_k = P(C \mid B) = \frac{P(C \mid A)}{P(B \mid A)}. \]

The quantity $P(B \mid A)$ is $e_{k,k+1}$, which was just derived. The quantity $P(C \mid A)$ can be obtained by multiplying Equation 9 by 1-a and adding over those possibilities for which $K_L = 0$. That is,

\[
e_{k,k+1}\,P(K_U=\ell,\ K_S=k-\ell,\ K_L=0,\ \text{no transition to L takes place after correct guess on last response}) =
\begin{cases}
\gamma^{k-1}g(1-a)^2, & \text{for } \ell = k;\\
g\gamma^{\ell-1}b(1-c)^{k-\ell}(1-a), & \text{for } \ell = 1,\ldots,k-1;
\end{cases}
\]

and

\[
P(C \mid A) = \gamma^{k-1}g(1-a)^2 + bg(1-a)(1-c)^{k-1}\sum_{\ell=1}^{k-1}\left(\frac{\gamma}{1-c}\right)^{\ell-1}
\]
\[
= \gamma^{k-1}g(1-a)^2 + \frac{bg(1-a)}{1-\frac{\gamma}{1-c}}\left[(1-c)^{k-1}-\gamma^{k-1}\right]
\]
\[
= g(1-a)\left[\left(1-a-\frac{b}{1-\frac{\gamma}{1-c}}\right)\gamma^{k-1} + \frac{b}{1-\frac{\gamma}{1-c}}(1-c)^{k-1}\right].
\]

This formula also simplifies in important special cases:

\[
P(C \mid A) =
\begin{cases}
g(1-a)b(1-c)^{k-1}, & \text{when } a+b = 1,\ c > 0;\\[4pt]
g(1-a)\left[(1-a)\gamma^{k-1} + \dfrac{b(1-\gamma^{k-1})}{1-\gamma}\right], & \text{when } a+b < 1,\ c = 0;\\[4pt]
bg(1-a), & \text{when } a+b = 1,\ c = 0.
\end{cases}
\]



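Finally, the formula for $L'_k$ for $k \ge 2$ is

\[
L'_k = \frac{g(1-a)\left[\left(1-a-\dfrac{b}{1-\frac{\gamma}{1-c}}\right)\gamma^{k-1} + \dfrac{b}{1-\frac{\gamma}{1-c}}(1-c)^{k-1}\right]}
{1-(1-g)\left\{(1-a)\gamma^{k-1} + (1-a-b)\dfrac{1-\gamma^{k-1}}{1-\gamma} + \dfrac{b}{1-\frac{\gamma}{1-c}}\left[(1-c)^{k-1}-\gamma^{k-1}\right]\right\}}.
\tag{12}
\]

In the special cases mentioned above, Equation 12 becomes

\[
L'_k =
\begin{cases}
\dfrac{g(1-a)b(1-c)^{k-1}}{1-b(1-g)(1-c)^{k-1}}, & \text{when } a+b = 1,\ c > 0;\\[8pt]
\dfrac{g(1-a)\left[(1-a)(1-\gamma)\gamma^{k-1} + b(1-\gamma^{k-1})\right]}{1-\gamma-(1-g)(1-a)(1-\gamma^{k})}, & \text{when } a+b < 1,\ c = 0;\\[8pt]
\dfrac{g(1-a)b}{1-b(1-g)}, & \text{when } a+b = 1,\ c = 0.
\end{cases}
\tag{13}
\]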
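As a sanity check on the ratio $P(C \mid A)/P(B \mid A)$, the sketch below evaluates both pieces from their closed forms and compares the result with the simplest special case. The function names are hypothetical and the formulas are my reading of Equations 10 and 12; the code assumes $\gamma \neq 1-c$.

```python
def prob_no_further_errors(k, a, b, c, g):
    """P(B|A) = e_{k,k+1}, closed form of Equation 10."""
    gam = g * (1 - a - b)
    w = b / (1 - gam / (1 - c))
    in_u = (1 - a - b) * (1 - gam ** (k - 1)) / (1 - gam)
    via_s = w * ((1 - c) ** (k - 1) - gam ** (k - 1))
    return 1 - (1 - g) * ((1 - a) * gam ** (k - 1) + in_u + via_s)


def prob_complete_but_unlearned(k, a, b, c, g):
    """P(C|A): cycle completes without further error, item not in L."""
    gam = g * (1 - a - b)
    w = b / (1 - gam / (1 - c))
    return g * (1 - a) * ((1 - a - w) * gam ** (k - 1)
                          + w * (1 - c) ** (k - 1))


def prob_not_learned_after_error(k, a, b, c, g):
    """L'_k for k >= 2, the ratio in Equation 12."""
    return (prob_complete_but_unlearned(k, a, b, c, g)
            / prob_no_further_errors(k, a, b, c, g))
```

For a = .129, b = .871, c = 0, g = .25 the result does not depend on k, as the third case of Equation 13 requires.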






It might be noted that the formula for $q_k$ given in Equation 7 reduces to the formula for $q_1$ when 1-a is substituted for $L'_k$. Equation 13 can be used with Theorems 3.2 and 3.3 to compute estimates of $\pi_{k,k}$. Theorem 3.1 tells us how to compute $EW_k$ (and hence $EP_k$) given $\pi_{k,k}$, $e_{k,k+1}$, and $\mu(E'_k)$, the mean conditional waiting time for another error following an error on cycle k, given that there is going to be another error. Approximations for all but the last of these quantities have been given above.

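The distribution and expected value of $E'_k$. Since $E'_k = E'_{k,U} + E'_{k,S}$, we can use Equation 8 to write

\[
(1-e_{k,k+1})\,P(E'_k = v) =
\begin{cases}
\gamma^{v-1}(1-a-b)(1-g), & \text{for } v = 1,\ldots,k-1;\\[4pt]
\gamma^{k-1}(1-a)(1-g) + \displaystyle\sum_{m=1}^{k-1} b\gamma^{k-m-1}(1-c)^{m}(1-g), & \text{for } v = k;
\end{cases}
\]
\[
=
\begin{cases}
\gamma^{v-1}(1-a-b)(1-g), & \text{for } v = 1,\ldots,k-1;\\[4pt]
\gamma^{k-1}\left(1-a-\dfrac{b}{1-\frac{\gamma}{1-c}}\right)(1-g) + \dfrac{b(1-g)}{1-\frac{\gamma}{1-c}}(1-c)^{k-1}, & \text{for } v = k.
\end{cases}
\tag{14}
\]

In the special cases, Equation 14 becomes

\[
(1-e_{k,k+1})\,P(E'_k = v) =
\begin{cases}
b(1-g)(1-c)^{k-1} \text{ for } v = k;\quad 0 \text{ otherwise}, & \text{when } a+b = 1,\ c > 0;\\[6pt]
\gamma^{v-1}(1-a-b)(1-g) \text{ for } v = 1,\ldots,k-1; & \\
\gamma^{k-1}\left(1-a-\dfrac{b}{1-\gamma}\right)(1-g) + \dfrac{b(1-g)}{1-\gamma} \text{ for } v = k, & \text{when } a+b < 1,\ c = 0;\\[6pt]
b(1-g) \text{ for } v = k;\quad 0 \text{ otherwise}, & \text{when } a+b = 1 \text{ and } c = 0.
\end{cases}
\tag{15}
\]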



Straightforward summation of series, omitted here for the sake of brevity, results in the following expression for $(1-e_{k,k+1})\mu(E'_k)$:

\[
(1-e_{k,k+1})\mu(E'_k) = (1-a-b)(1-g)\,\frac{1-k\gamma^{k-1}+(k-1)\gamma^{k}}{(1-\gamma)^2}
+ k(1-g)\left[\gamma^{k-1}\left(1-a-\frac{b}{1-\frac{\gamma}{1-c}}\right) + \frac{b}{1-\frac{\gamma}{1-c}}(1-c)^{k-1}\right].
\tag{16}
\]

The restriction that c = 0 does not in itself produce any significant simplification in Equation 16. The restriction that a + b = 1 does, however. In this case

\[
(1-e_{k,k+1})\mu(E'_k) =
\begin{cases}
kb(1-g)(1-c)^{k-1}, & \text{when } a+b = 1;\\
kb(1-g), & \text{when } a+b = 1 \text{ and } c = 0.
\end{cases}
\tag{17}
\]
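Because the series summation is omitted, it may help to confirm Equation 16 by brute force. The sketch below builds the weighted distribution of Equation 14 term by term and compares its mean with the closed form. The function names are hypothetical, and both expressions reflect my reading of the scanned equations.

```python
def weighted_waiting_pmf(k, a, b, c, g):
    """(1 - e) * P(E'_k = v) for v = 1..k, summation form of Equation 14."""
    gam = g * (1 - a - b)
    pmf = {v: gam ** (v - 1) * (1 - a - b) * (1 - g) for v in range(1, k)}
    last = gam ** (k - 1) * (1 - a) * (1 - g)
    for m in range(1, k):
        last += b * gam ** (k - m - 1) * (1 - c) ** m * (1 - g)
    pmf[k] = last
    return pmf


def weighted_waiting_mean(k, a, b, c, g):
    """(1 - e) * mu(E'_k), closed form of Equation 16."""
    gam = g * (1 - a - b)
    w = b / (1 - gam / (1 - c))
    head = ((1 - a - b) * (1 - g)
            * (1 - k * gam ** (k - 1) + (k - 1) * gam ** k) / (1 - gam) ** 2)
    tail = k * (1 - g) * (gam ** (k - 1) * (1 - a - w)
                          + w * (1 - c) ** (k - 1))
    return head + tail
```

Summing `v * p` over the dictionary returned by `weighted_waiting_pmf` should reproduce `weighted_waiting_mean` for any admissible parameters.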



Summary of the calculation of $EP_k$. If the expression for $EW_k$ given in Equation 4 is substituted for $EW_k$ in Equation 1, the result is

\[ EP_k = 1 + \pi_{k,k}\left(k + \frac{(1-e_{k,k+1})\,\mu(E'_k)}{e_{k,k+1}}\right). \]

If the expressions derived above are substituted in the right-hand side of this equation, the result is a cumbersome expression for $EP_k$ in terms of the model parameters. In general, this expression is unenlightening, so it will not be reproduced here. It should be noted, however, that it simplifies in the case a+b = 1 to the following:

\[ EP_k = 1 + \frac{k\,\pi_{k,k}}{1-b(1-g)(1-c)^{k-1}}. \]

Furthermore,

\[ EP_k = 1 + \frac{k\,\pi_{k,k}}{1-b(1-g)}, \quad \text{when } a+b = 1 \text{ and } c = 0. \]
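To see the last formula at work, the sketch below steps through successive cycles for the case a+b = 1, c = 0. It is an illustration under stated assumptions, not the report's own algorithm: the function name is hypothetical, and the update for the probability of still being unlearned at the start of the next cycle, $u_{k+1} = u_k[g(1-a) + (1-g)L'_k]$ with $L'_1 = 1-a$, is my own reading of how Equation 13 and $\pi_{k,k} = (1-g)u_k$ fit together, since Equation 7 and Theorems 3.2 and 3.3 are not reproduced here.

```python
def oem_cycles_greeno(a, g, cycles):
    """Cycle-by-cycle predictions under the OEM procedure, a + b = 1, c = 0.

    Returns rows of (k, pi_kk, cumulative EP, P(long-term state after cycle k)).
    """
    b = 1.0 - a
    e = 1.0 - b * (1.0 - g)                 # Equation 11, third case
    unlearned_after_error = g * (1.0 - a) * b / e   # Equation 13, third case
    u = 1.0                                 # every item starts in state U
    cumulative = 0.0
    rows = []
    for k in range(1, cycles + 1):
        pi = (1.0 - g) * u                  # error on first response of cycle k
        if k == 1:
            ep = 1.0                        # one presentation on the first cycle
            u_next = u * (1.0 - a)
        else:
            ep = 1.0 + k * pi / e
            # assumed recursion: correct guess then no learning, or error
            # followed by a make-up sequence that ends outside state L
            u_next = u * (g * (1.0 - a) + (1.0 - g) * unlearned_after_error)
        cumulative += ep
        rows.append((k, pi, cumulative, 1.0 - u_next))
        u = u_next
    return rows
```

Called as `oem_cycles_greeno(0.129, 0.25, 4)`, using the Case 1 parameter values reported later in the text, the cumulative presentation counts come out close to 5.77, 10.32 and 14.29 for cycles 2 through 4.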

Other Operating Characteristics of the OEM Procedure

One of the chief reasons for studying the operating characteristics of the OEM procedure (under the assumption that some model in the GFT applies) is to find ways of modifying the procedure to get better instructional results. In a number of experiments it has been found that a > c. When this is the case in an instructional setting, one should maximize the cumulative number of presentations in state U. Thus, it is useful to consider how many of the $P_k$ presentations of an item on cycle k are in states U, S, and L, respectively. This information may suggest modifications which would increase the proportion of presentations in state U.



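Let $P_{k,U}$, $P_{k,S}$, and $P_{k,L}$ be the number of presentations in the respective states on cycle k. Let $\mu(K_U)$, $\mu(K_S)$, and $\mu(K_L)$ be the means of $K_U$, $K_S$, and $K_L$, respectively, and let $u_k$, $s_k$, $\ell_k$ be the respective probabilities that an item is in U, S, or L on its first presentation on cycle k. Then

\[ P_{k,U} + P_{k,S} + P_{k,L} = P_k, \]

\[ EP_{k,U} = u_k + \pi_{k,k}\left[\frac{(1-e_{k,k+1})\,\mu(E'_{k,U})}{e_{k,k+1}} + \mu(K_U)\right], \]

\[ EP_{k,S} = s_k + \pi_{k,k}\left[\frac{(1-e_{k,k+1})\,\mu(E'_{k,S})}{e_{k,k+1}} + \mu(K_S)\right], \]

and

\[ EP_{k,L} = \ell_k + \pi_{k,k}\,\mu(K_L). \]

The task now is to find $u_k$, $s_k$, $\ell_k$, $\mu(K_U)$, $\mu(K_S)$, $\mu(K_L)$, $(1-e_{k,k+1})\mu(E'_{k,U})$, and $(1-e_{k,k+1})\mu(E'_{k,S})$.

Finding $(1-e_{k,k+1})\mu(E'_{k,U})$ and $(1-e_{k,k+1})\mu(E'_{k,S})$. The procedure is very direct: first find $(1-e_{k,k+1})\mu(E'_{k,S})$ by adding weighted terms using Equation 8; then find $(1-e_{k,k+1})\mu(E'_{k,U})$ by subtracting the result from $(1-e_{k,k+1})\mu(E'_k)$, which is given by Equation 16.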



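\[
(1-e_{k,k+1})\mu(E'_{k,S}) = \sum_{m=1}^{k-1} m\,\gamma^{k-m-1}\,b\,(1-c)^{m}(1-g)
\]
\[
= b(1-g)\,\frac{(k-1)\left(1-\frac{\gamma}{1-c}\right)(1-c)^{k-1} - \frac{\gamma}{1-c}\left[(1-c)^{k-1}-\gamma^{k-1}\right]}{\left(1-\frac{\gamma}{1-c}\right)^{2}}.
\tag{18}
\]

Then subtracting Equation 18 from Equation 16, we have

\[
(1-e_{k,k+1})\mu(E'_{k,U}) = (1-a-b)(1-g)\,\frac{1-k\gamma^{k-1}+(k-1)\gamma^{k}}{(1-\gamma)^{2}} + k(1-g)(1-a)\gamma^{k-1}
+ \frac{b(1-g)}{\left(1-\frac{\gamma}{1-c}\right)^{2}}\left[(1-c)^{k-1}-\gamma^{k-1}\right] - \frac{(k-1)b(1-g)}{1-\frac{\gamma}{1-c}}\,\gamma^{k-1}.
\]

As was the case with $(1-e_{k,k+1})\mu(E'_k)$, these formulas can be simplified significantly only if a+b = 1. The results for that case are

\[
(1-e_{k,k+1})\mu(E'_{k,U}) =
\begin{cases}
b(1-g)(1-c)^{k-1}, & \text{when } a+b = 1;\\
b(1-g), & \text{when } a+b = 1 \text{ and } c = 0;
\end{cases}
\]

and

\[
(1-e_{k,k+1})\mu(E'_{k,S}) =
\begin{cases}
(k-1)b(1-g)(1-c)^{k-1}, & \text{when } a+b = 1;\\
(k-1)b(1-g), & \text{when } a+b = 1 \text{ and } c = 0.
\end{cases}
\]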



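Finding $\mu(K_U)$, $\mu(K_S)$, and $\mu(K_L)$. The approach to calculating these quantities is also direct. Expressions for $e_{k,k+1}\mu(K_S)$ and $e_{k,k+1}\mu(K_L)$ will be found first because the probability expressions for $K_S > 0$ and $K_L > 0$ given in Equation 9 involve only two of the four types pictured in Figure 3.3, while those for $K_U$ involve all four types. Once $\mu(K_S)$ and $\mu(K_L)$ have been found, we know $\mu(K_U) = k - \mu(K_S) - \mu(K_L)$. On the basis of Equation 9, we can write

\[
e_{k,k+1}P(K_L = n) = a\gamma^{k-n} + \sum_{m=1}^{k-n}\gamma^{k-m-n}\,b\,(1-c)^{m-1}c
= \left(a - \frac{\frac{bc}{1-c}}{1-\frac{\gamma}{1-c}}\right)\gamma^{k-n} + \frac{\frac{bc}{1-c}}{1-\frac{\gamma}{1-c}}\,(1-c)^{k-n}, \quad \text{for } n = 1,\ldots,k.
\]

Therefore,

\[
e_{k,k+1}\mu(K_L) = \left(a - \frac{\frac{bc}{1-c}}{1-\frac{\gamma}{1-c}}\right)\sum_{n=1}^{k} n\gamma^{k-n} + \frac{\frac{bc}{1-c}}{1-\frac{\gamma}{1-c}}\sum_{n=1}^{k} n(1-c)^{k-n}
\tag{22}
\]
\[
=
\begin{cases}
ak, & \text{when } a+b = 1,\ c = 0;\\[6pt]
a\left[\dfrac{k}{1-\gamma} - \dfrac{\gamma(1-\gamma^{k})}{(1-\gamma)^{2}}\right], & \text{when } a+b < 1,\ c = 0;\\[10pt]
\left(a - \dfrac{\frac{bc}{1-c}}{1-\frac{\gamma}{1-c}}\right)\left[\dfrac{k}{1-\gamma} - \dfrac{\gamma(1-\gamma^{k})}{(1-\gamma)^{2}}\right] + \dfrac{\frac{bc}{1-c}}{1-\frac{\gamma}{1-c}}\left[\dfrac{k}{c} - \dfrac{(1-c)\left[1-(1-c)^{k}\right]}{c^{2}}\right], & \text{when } a+b < 1,\ c > 0.
\end{cases}
\]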



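The corresponding argument in the case of $e_{k,k+1}\mu(K_S)$ is that

\[
e_{k,k+1}P(K_S = m) = bg\gamma^{k-m-1}(1-c)^{m} + \sum_{n=1}^{k-m} bc\,\gamma^{k-m-n}(1-c)^{m-1}, \quad \text{for } m = 1,\ldots,k-1;
\tag{23}
\]

therefore,

\[
e_{k,k+1}\mu(K_S) =
\begin{cases}
bg(k-1), & \text{when } a+b = 1,\ c = 0;\\[6pt]
bg\left[\dfrac{k-1}{1-\gamma} - \dfrac{\gamma(1-\gamma^{k-1})}{(1-\gamma)^{2}}\right], & \text{when } a+b < 1,\ c = 0;\\[10pt]
bg(1-c)^{k-1}(k-1) + \dfrac{b}{c}\left[1-(1-c)^{k-1} - (k-1)(1-c)^{k-1}c\right], & \text{when } a+b = 1,\ c > 0;\\[10pt]
\displaystyle\sum_{m=1}^{k-1} m\,e_{k,k+1}P(K_S = m), \text{ with each term taken from Equation 23}, & \text{when } a+b < 1,\ c > 0.
\end{cases}
\tag{24}
\]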



Approximations for $\mu(K_U)$, $\mu(K_S)$, and $\mu(K_L)$ can now be computed using Equations 22 and 24 and the formulas for $e_{k,k+1}$ given in Equations 10 and 11. For example, in the simplest non-trivial case, where a+b = 1 and c = 0, they are given by

\[ \mu(K_L) = \frac{ka}{1-b(1-g)}, \]

\[ \mu(K_S) = \frac{(k-1)\,bg}{1-b(1-g)}, \]

and

\[ \mu(K_U) = \frac{bg}{1-b(1-g)}. \]
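The three means for the simplest case can be recovered by enumerating the four point types of Equation 9 directly, which also confirms that they add up to k. The following is a sketch with a hypothetical function name, based on my reading of Equation 9.

```python
def criterion_run_means(k, a, b, c, g):
    """Return (mu_U, mu_S, mu_L) by enumerating the points of Equation 9."""
    gam = g * (1 - a - b)
    totals = [0.0, 0.0, 0.0, 0.0]        # mass, sum of l, sum of m, sum of n

    def add(prob, l, m, n):
        totals[0] += prob
        totals[1] += prob * l
        totals[2] += prob * m
        totals[3] += prob * n

    add(gam ** (k - 1) * g * (1 - a), k, 0, 0)
    for l in range(0, k):
        add(a * gam ** l, l, 0, k - l)
    for l in range(1, k):
        add(g * gam ** (l - 1) * b * (1 - c) ** (k - l), l, k - l, 0)
    for n in range(1, k):
        for m in range(1, k - n + 1):
            add(gam ** (k - m - n) * b * (1 - c) ** (m - 1) * c,
                k - m - n, m, n)
    mass = totals[0]                     # equals e_{k,k+1}
    return totals[1] / mass, totals[2] / mass, totals[3] / mass
```

With a = .25, b = .75, c = 0 and g = .25, the enumeration should reproduce ka, (k-1)bg and bg, each divided by 1 - b(1-g).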






The values of $u_k$, $s_k$, and $\ell_k$ are needed in order to complete the state-by-state breakdown of the average number of presentations on cycle k. These are easily determined by the relationships $\pi_{k,k} = (1-g)u_k$, $1-\pi_{k,k} = \ell_k + gu_k$, and $s_k = 0$. The fact that $s_k = 0$ is a consequence of the approximation assumptions. These relationships imply that

\[ u_k = \frac{\pi_{k,k}}{1-g} \tag{25} \]

and

\[ \ell_k = 1 - \frac{\pi_{k,k}}{1-g}. \]

Formulas have been derived which provide for the calculation of the main quantities of interest when the OEM strategy is employed. We turn now to the calculation of analogous quantities when other strategies are used.

Operating Characteristics of Other Strategies
An Ideal Modification of the OEM Strategy

Most experiments to date concerned with the evaluation of the GFT framework suggest that parameter a > c. If the number of items being presented is large, say, greater than 20, then most of the items will not be in the short-term state at a given time. If it could be guaranteed that all items to be presented would not be in S, then the question of the optimal item to present would be reduced to the question of which item is most likely to still be in state U. The OEM would describe the learning process and the OEM strategy would be optimal. Such a modification of the OEM strategy would be ideal if it could be accomplished.

There are a number of ways one might approach this ideal. All items could be presented on a given cycle before any item receiving an error response was presented again. Those items receiving error responses could then be presented in the standard cyclic fashion for the required number of make-up trials. Those items in the sublist receiving no errors in the remedial phase would be removed and the others would be presented once again and checked against the criterion, and so forth. The outcome of such a procedure should be that only a few items toward the end of the cycle would receive repeated presentations without a fairly large number of intervening items.

It is hard to say exactly how close an approach like the one just described would come to the ideal. The matter will not be pursued further here. But the operating characteristics of the ideal modification are easy to calculate, because they are just the characteristics derived in the last section, computed under the assumption that the OEM applies. The results will be stated here without proof, because they are based on well-known results for the OEM.

The mean number of presentations on cycle k. By Theorem 3.1 we know that

Consider $\ell_j$, the probability that an item is in state L at the end of the j-th cycle. Successive values of $\ell_j$ can be computed by the formula

i )ag 
J J 

Then we have 






By either letting b = 0 in Equation 16 or by straightforward argument from the properties of the OEM, it can be shown that

\[
(1-e_{k,k+1})\mu(E'_k) = \frac{(1-a)(1-g)}{[1-g(1-a)]^{2}}\left(1-[g(1-a)]^{k}\right) - k\,\frac{(1-a)(1-g)}{1-g(1-a)}\,[g(1-a)]^{k}.
\tag{26}
\]

Similarly, letting b = 0 in Equation 10 or proceeding by a direct argument yields

\[
e_{k,k+1} = 1 - \frac{(1-g)(1-a)}{1-g(1-a)}\left(1-[g(1-a)]^{k}\right).
\tag{27}
\]

It is interesting to compare these results with well-known results for the OEM. Let k become indefinitely large in Equations 26 and 27. Then we get

\[ \lim_{k\to\infty} e_{k,k+1} = \frac{a}{1-g(1-a)}, \]

which corresponds to the standard result for the probability of no more errors, following an error, when the OEM applies; and

\[ \lim_{k\to\infty}(1-e_{k,k+1})\mu(E'_k) = \frac{(1-a)(1-g)}{[1-g(1-a)]^{2}}. \]

If we let L be the trial of last error in an infinite sequence of presentations, then the mean of L is given by

\[ EL = \frac{1-g}{a\,[1-g(1-a)]}. \]

Thus

\[ \lim_{k\to\infty}\frac{(1-e_{k,k+1})\mu(E'_k)}{e_{k,k+1}} = \frac{(1-a)(1-g)}{a\,[1-g(1-a)]} = (1-a)\,EL. \]







The quantity EL is the mean trial of last error, starting from state U. With probability a, an item moves to state L following an error, in which case there will certainly be no more errors. With probability 1-a, the process starts again from U.
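The limiting argument can be illustrated numerically. The sketch below assumes the b = 0 forms of the two quantities, namely $e_{k,k+1} = 1 - (1-g)(1-a)(1-\gamma^k)/(1-\gamma)$ and $(1-e_{k,k+1})\mu(E'_k) = (1-a)(1-g)(1-\gamma^k)/(1-\gamma)^2 - k(1-a)(1-g)\gamma^k/(1-\gamma)$ with $\gamma = g(1-a)$, and takes $EL = (1-g)/(a[1-g(1-a)])$ for the one-element model. These are my reconstructions and the function names are hypothetical.

```python
def e_ideal(k, a, g):
    """Probability of no further error on cycle k when b = 0."""
    gam = g * (1 - a)
    return 1 - (1 - g) * (1 - a) * (1 - gam ** k) / (1 - gam)


def weighted_wait_ideal(k, a, g):
    """(1 - e) * mean waiting time for the next error when b = 0."""
    gam = g * (1 - a)
    return ((1 - a) * (1 - g) * (1 - gam ** k) / (1 - gam) ** 2
            - k * (1 - a) * (1 - g) * gam ** k / (1 - gam))


def mean_trial_of_last_error(a, g):
    """EL for the one-element model, starting from state U."""
    return (1 - g) / (a * (1 - g * (1 - a)))


def mean_trial_of_last_error_series(a, g, terms=5000):
    """Same quantity summed directly: the last error falls on trial n when the
    item is still in U on trial n, errs, and then never errs again."""
    no_more_errors = a / (1 - g * (1 - a))
    return sum(n * (1 - a) ** (n - 1) * (1 - g) * no_more_errors
               for n in range(1, terms + 1))
```

For large k, `weighted_wait_ideal(k, a, g) / e_ideal(k, a, g)` approaches `(1 - a) * mean_trial_of_last_error(a, g)`, which is the relation the paragraph above explains.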

Other operating characteristics of the ideal modification. By hypothesis, no items in the short-term state are ever presented under the modified strategy. Hence,

\[ (1-e_{k,k+1})\mu(E'_{k,U}) = (1-e_{k,k+1})\mu(E'_k), \]

which has already been calculated. It is easy to show that

(oQ) e n(K ) • ^^[S(l-a)3'^tg(l-a)-[fi(l-a)]^-a] a[l- [g( 1-a) ]^ } 

The breakdown of $EP_k$ into the mean number of presentations in each state is as follows.

and 

The asymptotic distribution of the cycle of last error. It may sometimes be of interest to consider what would happen if either the OEM procedure or its ideal modification were continued for a very large number of cycles. It is clear that sooner or later all items would be learned and there would be no further errors. In fact, the distribution of the cycle of last error can be expressed simply in terms of the $\pi_{k,k}$'s. Define $\pi_{k,\infty}$ and $e_{k,\infty}$ as follows.

$\pi_{k,\infty}$ = P(last error is on cycle k)

and

$e_{k,\infty}$ = P(no more errors | error on cycle k).



The 

R.GD 

n-> 00 

that 



n jt = lim jt, , and Jt - \. '^'^ follows from Equation Ic 

k^OO K,n K^OO K,K K 



n-k 

n-» CD v=l ^ 



A more profitable way to look at $e_{k,\infty}$ is to condition on whether an item is in state L or not following the make-up sequence on cycle k. Then we get

(29) = + P(no more errors] in state U following cycle k) 

= \ + H [g(l-a)]^"^-V 
v=k+l 



= 1 - 



Therefore, the asymptotic probability that the last error is on cycle k 
is given approximately by 

Operating Characteristics of the RC Procedure

For purposes of comparison, it is desirable to know the operating characteristics of the RC presentation procedure used in many experiments on paired-associate learning. The usual approach used to derive theoretical predictions for this procedure, when models like those in the GFT are being considered, is to multiply the learning matrix by an "average" forgetting matrix to get a single transition matrix summarizing the effects of learning following presentation and forgetting between presentations. Approximate theoretical predictions can then be made in terms of this single transition matrix. For the GFT, this matrix is given by



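\[
P =
\begin{bmatrix}
1 & 0 & 0\\
c & 1-c & 0\\
a & b & 1-a-b
\end{bmatrix}
\begin{bmatrix}
1 & 0 & 0\\
0 & 1-f & f\\
0 & 0 & 1
\end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0\\
c & (1-c)(1-f) & (1-c)f\\
a & b(1-f) & bf+(1-a-b)
\end{bmatrix}.
\tag{31}
\]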







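Calculation of the n-stage transition matrix, $P^n$, is facilitated by noting that the matrix P can be partitioned as follows, where B is a 2 × 2 matrix.

\[ P = \begin{bmatrix} 1 & 0\\ A & B \end{bmatrix}. \]

It follows by simple algebra that

\[
P^{n} = \begin{bmatrix} 1 & 0\\[4pt] \displaystyle\sum_{i=1}^{n} B^{\,i-1}A & B^{n} \end{bmatrix}.
\tag{32}
\]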



In the general case, there does not seem to be any particularly simple way to express $B^n$. Of course, it is very easy to carry out the multiplications numerically, so the lack of a simple expression causes no special difficulty. However, in the case a+b = 1, $B^n$ does have a very simple form. If we let $\lambda = (1-f)(1-c) + bf$, it is easy to verify that

\[ B^{n} = \lambda^{n-1}B. \]

This is so because $\lambda$ is a characteristic root of B (and of P) and each row of B is a left characteristic vector corresponding to $\lambda$. Therefore, $P^n$ can be written as

\[
P^{n} =
\begin{bmatrix}
1 & 0 & 0\\
1-(1-c)\lambda^{n-1} & (1-c)(1-f)\lambda^{n-1} & (1-c)f\lambda^{n-1}\\
1-(1-a)\lambda^{n-1} & (1-a)(1-f)\lambda^{n-1} & (1-a)f\lambda^{n-1}
\end{bmatrix}.
\tag{33}
\]

In the general case, two positive values of $\lambda$ are given by the formula

\[
\lambda = \frac{(1-c)(1-f)+bf+(1-a-b) \pm \sqrt{\left[(1-c)(1-f)+bf+(1-a-b)\right]^{2} - 4(1-c)(1-f)(1-a-b)}}{2}.
\tag{34}
\]

Equation 34 shows why the case a+b = 1 is special. A modification of Equation 33 using both $\lambda$'s is useful in computing $P^n$ in the general case, even though the theoretical formulas would be very messy in terms of the basic model parameters.
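The role of the two characteristic roots can be made concrete with a few lines of arithmetic. This sketch treats B as the lower-right 2 × 2 block of P, with trace $(1-c)(1-f)+bf+(1-a-b)$ and determinant $(1-c)(1-f)(1-a-b)$; the function name is hypothetical.

```python
import math


def characteristic_roots(a, b, c, f):
    """Roots of the 2 x 2 block B of the RC transition matrix (Equation 34)."""
    trace = (1 - c) * (1 - f) + b * f + (1 - a - b)
    det = (1 - c) * (1 - f) * (1 - a - b)
    disc = math.sqrt(trace * trace - 4 * det)
    return (trace + disc) / 2, (trace - disc) / 2
```

When a+b = 1 the determinant vanishes, so one root collapses to zero and the other equals $(1-f)(1-c)+bf$. That is why a single power of $\lambda$ suffices in that case.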

In the case a+b = 1, $\lambda$ can be interpreted as the proportion of items currently in state S or state U which will still be in S or U following the next presentation.

If the starting state vector, [0, 0, 1], is postmultiplied by $P^n$, the result is the expected state vector after trial n. Thus

\[
\begin{aligned}
\ell_n &= 1-(1-a)\lambda^{n-1},\\
s_n &= (1-a)(1-f)\lambda^{n-1}, \text{ and}\\
u_n &= (1-a)f\lambda^{n-1}.
\end{aligned}
\tag{35}
\]
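The closed form for the state vector can be compared with direct matrix multiplication. The sketch below builds P as the product of the learning and forgetting matrices, with states ordered L, S, U, and pushes the starting vector [0, 0, 1] through n presentations. The helper names are hypothetical, and plain lists are used to keep the example dependency-free.

```python
def mat_mul(x, y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x[i][t] * y[t][j] for t in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def rc_transition_matrix(a, b, c, f):
    """Learning matrix times average forgetting matrix (states L, S, U)."""
    learning = [[1, 0, 0], [c, 1 - c, 0], [a, b, 1 - a - b]]
    forgetting = [[1, 0, 0], [0, 1 - f, f], [0, 0, 1]]
    return mat_mul(learning, forgetting)


def state_after(n, a, b, c, f):
    """Expected state vector [l_n, s_n, u_n] after n presentations from U."""
    p = rc_transition_matrix(a, b, c, f)
    vec = [0.0, 0.0, 1.0]
    for _ in range(n):
        vec = [sum(vec[i] * p[i][j] for i in range(3)) for j in range(3)]
    return vec
```

For parameter sets with a+b = 1, the three entries should match $1-(1-a)\lambda^{n-1}$, $(1-a)(1-f)\lambda^{n-1}$ and $(1-a)f\lambda^{n-1}$ with $\lambda = (1-f)(1-c)+bf$.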

Equation 35 can be used to derive all the operating characteristics 
one desires. The details will be discussed in the next chapter, in which 
the operating characteristics derived in this chapter will be used in 
numerical comparison of procedures. 






CHAPTER IV 

NUMERICAL COMPARISON OF PRESENTATION STRATEGIES 



The purpose of this chapter is to compare the three presentation strategies we have been considering, using the formulas which were derived in the last chapter with specific parameter values to make predictions. Two kinds of questions are of particular interest. One concerns comparison of strategies given a certain set of parameter values. For example, how big is the difference between the RC procedure and the OEM procedure in terms of how many items are in the long-term retention state after a given number of presentations? How big is the difference between the OEM procedure and its ideal modification? Another kind of question of interest concerns how answers to the first kind of question vary as a function of parameter values. Do changes in the rate of transition from the unlearned to the long-term state affect the size of the differences between the OEM procedure and the RC procedure? In order to address these questions, operating characteristics of the three strategies have been computed for three different sets of parameter values. It might be helpful at this point to say a few words about the particular values that were chosen.

From the point of view of ease of calculation the best parameter values to choose satisfy the constraints of the model proposed by Greeno (1966): c = 0 and a + b = 1. That is, the probability of a presented item making a direct transition from the short-term to the long-term state is 0 and the probability is 1 that a presented item in the unconditioned state will make a transition either to the short-term or to the long-term state. Expressions for operating characteristics of the OEM procedure are considerably simpler in this case than they are in general.

It was noted in the previous chapter that the OEM procedure leads to cyclic presentation of items, with repeated presentation of items receiving error responses on a given cycle. If Greeno's model holds, these massed presentations are useless. Immediately following the first presentation of an item on a given cycle the item is either in the long-term or short-term retention state, because a + b = 1. It is unnecessary to present it again if it is in the long-term state and it is useless to present it again immediately if it is in the short-term state, because c = 0. We want to examine the predictions of Greeno's model in some detail because we would expect them to differ sharply from the predictions of the OEM.

In contrast to Greeno's model, the LS-2 model proposed by Atkinson and Crothers (1964) makes almost the same predictions for operating characteristics of the OEM procedure as the OEM itself. As in Greeno's model a + b = 1, but in the LS-2 model c = a. Since the probability of the presented item making a transition to the long-term state is the same whether the item is in the short-term or unconditioned state, the fact that a + b = 1 does not diminish the value of the massed presentations which occur under the OEM procedure. Greeno's model and the LS-2 model make the same predictions for the ideal modification of the OEM procedure and for the RC procedure because these predictions depend only on a, the transition probability between the unconditioned and the long-term state. For this reason it is unnecessary to calculate the operating characteristics for the LS-2 model directly unless one is interested in the small, detailed differences between the LS-2 and OEM models under the OEM procedure.

Both cases mentioned so far satisfy the constraint a+b = 1. One of the interesting features of the data presented by Rumelhart (1967) was that the accuracy of predictions within the GFT framework could be significantly enhanced by allowing a+b < 1. This suggests that operating characteristics might be different for these two cases also. We will compare the predictions of two models which differ only with respect to whether a+b = 1 or a+b < 1. It will be seen that, contrary to our expectations, the differences in operating characteristics based on the two sets of parameters are minimal.

The reason the differences are small has to do with the most important variable influencing the relative performance of the strategies: the transition probability from the unlearned to the long-term state, a. When a is relatively large, as it is in Rumelhart's experiment, the differences between the three strategies are moderate and relatively insensitive to the values of the other parameters. When a is relatively small, the differences are pronounced and dependent on the values of the other parameters. These points will be expanded upon in an analysis of the detailed predictions for the three strategies using the three sets of parameter values given in Table 4.1.

Predictions of Greeno's Model When Learning is Slow

Atkinson and Crothers (1964) compared the fit of several models, including their LS-2 model, on eight different sets of experimental data. Of the eight experiments, the rate of learning was slowest in an experiment conducted by Hansen (1963) with four- and five-year-old nursery school children. Atkinson and Crothers report parameter estimates for the LS-2 model for this data. We are more interested in predictions for Greeno's model, for reasons described above. In order to adapt the parameter estimates for the LS-2 model to Greeno's model, we use the fact that if a' and f are parameters of the LS-2 model and a is the learning rate in Greeno's model, then Greeno's model with a = a'f and the same value of f will yield exactly the same predictions for the RC procedure as the LS-2 model. The parameter values for Case 1 given in Table 4.1 were obtained using this adjustment.

Table 4.1

Three Sets of Parameter Values to be Used to Generate
Theoretical Predictions of Operating Characteristics
Under the Three Presentation Strategies

Case    a      b      c     f      g      Experiment
1      .129   .871    0    .844   .250    Four- and five-year-old children. Hansen (1963)
2a     .410   .590    0    .954   .333    University undergraduates. Rumelhart (1967)
2b     .380   .360    0    .702   .333    University undergraduates. Rumelhart (1967)

Cases 2a and 2b fit data from the same experiment. Case 2b relaxes the constraint that a+b = 1.

Before considering the operating characteristics predicted from these parameters, the reader may wish to review the breakdown of presentations on a given cycle under the OEM procedure which is summarized in Tables 4.2a and 4.2b. Table 4.2a gives general terms and their explicit expression in the case of Greeno's model. Table 4.2b identifies the terms used in the formulas in Table 4.2a.

It is worth noting that under the OEM procedure Greeno's model predicts that the average number of presentations in the unlearned state on a given cycle, $EP_{k,U}$, is a constant multiple of $\pi_{k,k}$, the probability of error on the cycle. The number of presentations in the short-term state, $EP_{k,S}$, on the other hand, depends on the product $(k-1)\pi_{k,k}$. The overall expected number of presentations on cycle k is 1 plus a constant multiple of $k\pi_{k,k}$. These observations provide a basis for determining whether or not the OEM procedure will be far from optimal under Greeno's model. Massed presentations are a real problem only on later cycles where the criterion run is long. If $\pi_{k,k}$ converges to 0 relatively fast with increasing k, they are not a problem. If not, the OEM procedure will be far from optimal. This can be seen in the present case, where learning is slow, by comparing the performance of the OEM procedure and its ideal modification.

Table 4.2a

[Breakdown of presentations on a given cycle under the OEM procedure: general terms and their explicit expressions for Greeno's model. The body of this table is not recoverable from the source.]

Table 4.2b

Key to Terms Used in Table 4.2a

$u_k$ ($\ell_k$) = probability item is in state U (or L) following cycle k-1.

$\pi_{k,k}$ = probability of an error on the first response in cycle k.

$e_{k,k+1}$ = probability of no further errors on cycle k, following an error.

$1/e_{k,k+1}$ = expected number of errors on cycle k, given that at least one error occurs on the cycle.

$(1-e_{k,k+1})\mu(E'_k)$ = expected waiting time for the next error on cycle k, in terms of number of responses following an error, given that another error is going to occur.

$(1-e_{k,k+1})\mu(E'_{k,U})$, $(1-e_{k,k+1})\mu(E'_{k,S})$ = breakdown of $(1-e_{k,k+1})\mu(E'_k)$ by state.

$\mu(K_U)$, $\mu(K_S)$, $\mu(K_L)$ = expected number of presentations in the respective states on the criterion run on cycle k.

$EP_k$, $EP_{k,U}$, $EP_{k,S}$, $EP_{k,L}$ = expected number of presentations on cycle k, with breakdown by states.

The ideal modification of the OEM strategy serves two purposes in the comparisons to be made now. It presents an upper bound on how much the OEM strategy can be improved by simply manipulating the number of intervening items between presentations of an item on a cycle. It also gives an indication of the discrepancy between the predictions of the OEM and of Greeno's model for the OEM strategy, since its operating characteristics are what the OEM would predict, with or without the modification. When learning is slow, this discrepancy is pronounced, as may be seen in Table 4.3.

The effect of the two procedures on $\pi_{k,k}$ is the same for the first two cycles, so $\pi_{3,3}$ = .410 for both of them. But the OEM procedure requires 5.77 presentations per item to reduce error probability to this point, whereas the modified OEM procedure requires only 4.07 presentations. The size of the discrepancy increases for the next several cycles. Two more cycles under the modified procedure reduce $\pi_{k,k}$ to .04, a point that requires five additional cycles to reach under the OEM strategy. In terms of number of presentations per item, the comparison is 22.67 versus 10.90 presentations, a difference of more than 100%.

It is also interesting to compare the probability that an item is in the long-term state after a given number of cycles under the modified and unmodified OEM strategies with the corresponding probability for an item receiving the same number of presentations under the RC procedure. The difference between the modified OEM procedure and the RC procedure is dramatic in this respect, while the difference between the unmodified OEM and RC procedures is small and actually in the wrong direction for the first five cycles. See the last two columns of Table 4.3 for these comparisons.

Table 4.3

Predictions of Greeno's Model for Selected Operating
Characteristics of the OEM, Modified OEM, and RC
Procedures When Learning is Relatively Slow*

Column headings:
  (1) Cycle number k
  (2) Probability of error on cycle k
  (3) Expected cumulative number of presentations through cycle k
  (4) Probability item is in long-term state after k cycles under OEM procedure
  (5) Probability item is in long-term state after same number of presentations under RC procedure

(a) OEM procedure

(1)    (2)     (3)      (4)     (5)
 1     .750    1.00     .129    .129
 2     .653    5.77     .453    .49?
 3     .410   10.32     .656    .702
 4     .258   14.29     .784    .812
 5     .162   17.62     .864    .872
 6     .102   20.38     .915    .907
 7     .064   22.67      ?      .928
 8     .040   24.60     .966    .943

(b) Modified OEM procedure

(1)    (2)     (3)      (4)     (5)
 1     .750    1.00     .129    .129
 2     .653    4.07     .453    .390
 3     .410    8.37     .798    .627
 4     .151   10.90     .948    .722
 5     .039   12.34     .988    .764
 6     .009   13.45     .997    .793
 7     .003     ?       .999    .816
 8     .000   15.49    1.000    .836

*Note: Parameter values: a = .129, g = .25. A question mark stands for a digit or entry that is illegible in the source.
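The RC column of the table can be approximated from the closed form for the long-term state probability, $1-(1-a)\lambda^{n-1}$ with $\lambda = (1-f)+bf$ when c = 0 and a+b = 1. The sketch below is an illustration only: the function name is hypothetical, and evaluating the formula at a non-integer number of presentations n is my assumption about how the tabled values were matched to the cumulative presentation counts.

```python
def rc_long_term_probability(n, a, f):
    """P(item in long-term state) after n presentations under the RC
    procedure, for Greeno's model (a + b = 1, c = 0)."""
    b = 1.0 - a
    lam = (1.0 - f) + b * f          # proportion remaining in S or U
    return 1.0 - (1.0 - a) * lam ** (n - 1)
```

With the Case 1 values a = .129 and f = .844, n = 10.32 gives roughly .70 and n = 20.38 gives roughly .91, in line with the corresponding tabled entries.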

Predictions of Gree.no 's Model When Learning is Rapid 

Parameter values used to obtain predictions of Greeno's model wb/*n 
learning is rapid are given in Table ^.1, Case 2a. They are reported by 
Rumelhart (I967) to be the minimum chi-square estimates, computed by a 
grid search, for data from an experiment involving Stanford undergraduat 

One property of Greeno's model which has already been noted is that
the probability of error on a cycle is the same for both the modified
and unmodified procedures for the first three cycles. After the first
three cycles, it drops more rapidly for the modified procedure. When
learning is slow, this results in notable differences between the two
procedures in the probability of error on cycle k for k > 3. In the
present case, learning is so rapid that there is little room for these
error probabilities to differ for k > 3, because they are all close to 0.
Even though the error probabilities are close for the two procedures, it
is conceivable that the procedures differ in terms of the number of
presentations required to complete cycles. They do in this case, but
only slightly. For example, it takes 3.30 presentations on the average
to finish two cycles under the OEM procedure and 3.14 presentations
under the modified procedure.

By referring to Table 4.4, the reader can see that the small dif-
ferences between the modified and unmodified OEM procedures are typical
of the differences that can be considered. All the differences favor
the modified OEM procedure, as they must, but none of the differences




Table 4.4

Predictions of Greeno's Model for Selected Operating
Characteristics of the OEM, Modified OEM, and RC
Procedures When Learning is Relatively Rapid*

                        Expected cum-      Probability       Probability item
                        ulative number     item is in        is in long-term
Cycle    Probability    of presentations   long-term state   state after same
number   of error on    through cycle k    after k cycles    number of presen-
k        cycle k                           under OEM         tations under RC
                                           procedure         procedure

(a) OEM procedure

1        .667            1.00              .410              .410
2        .393            3.30              .809              .807
3        .128            4.93              .938              .915
4        .041            6.20              .980              .955
5        .013            7.31              .992              .974
6        .005            8.36              .998              .984

(b) Modified OEM procedure

1        .667            1.00              .410              .410
2        .393            3.14              .809              .793
3        .128            4.67              .957              .902
4        .029            5.82              .991              .945
5        .006            6.85              .996              .967
6        .001            7.86             1.000              .980

*Note: Parameter values: a = .41, g = .33.







are large. Examination of the last two columns of Table 4.4 also reveals
that when learning is rapid, Greeno's model predicts that the differences 
between the OEM procedure, modified or unmodified, and the RC procedure 
will be slight. 
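
The contrast between the slow and rapid cases can be checked with a little arithmetic on the last two columns of Tables 4.3(b) and 4.4(b). The sketch below is only an illustration: the dictionaries hold the tabled values as they read in this copy, the one row with an illegible digit (cycle 5 of Table 4.3(b)) is left out rather than guessed, and the names SLOW, RAPID, advantage_over_rc, and largest_advantage are hypothetical helpers, not part of the original analysis.

```python
# Probability that an item is in the long-term state, read from the last two
# columns of Tables 4.3(b) and 4.4(b): cycle k -> (modified OEM, RC).
# Assumption: values are transcribed from the tables as reconstructed here;
# cycle 5 of Table 4.3(b) is omitted because one RC digit is illegible.
SLOW = {
    1: (0.129, 0.129), 2: (0.453, 0.390), 3: (0.798, 0.627),
    4: (0.948, 0.722), 6: (0.997, 0.793), 7: (0.999, 0.816),
    8: (1.000, 0.836),
}
RAPID = {
    1: (0.410, 0.410), 2: (0.809, 0.793), 3: (0.957, 0.902),
    4: (0.991, 0.945), 5: (0.996, 0.967), 6: (1.000, 0.980),
}


def advantage_over_rc(table):
    """Return {cycle: modified-OEM probability minus RC probability}."""
    return {k: round(oem - rc, 3) for k, (oem, rc) in table.items()}


def largest_advantage(table):
    """Return (cycle, gap) for the cycle where modified OEM leads RC most."""
    gaps = advantage_over_rc(table)
    best_cycle = max(gaps, key=gaps.get)
    return best_cycle, gaps[best_cycle]


if __name__ == "__main__":
    for label, table in (("slow", SLOW), ("rapid", RAPID)):
        cycle, gap = largest_advantage(table)
        print(f"{label} learning: largest gap {gap:.3f} at cycle {cycle}")
```

On these figures the largest gap between the modified OEM and RC procedures is roughly four times as large when learning is slow as when it is rapid, which is the sense in which the rapid-learning differences are slight.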

Predictions of a More General Model When Learning is Rapid 

The parameters of Case 2b in Table 4.1 are what Rumelhart obtained
for the data just described when he relaxed the requirement that a+b = 1.
Predictions of operating characteristics of the OEM procedure made by
Greeno's model and the more general model are compared in Table 4.5.
The differences in the predictions are very slight indeed. In general,
one would expect there to be a difference in the predictions the two
models make regarding the expected number of presentations in the short-
term state per cycle. In the present case the model for which a+b < 1
predicts about a third fewer presentations in the short-term state than
does Greeno's model, but the rate of learning is so great that the
number of these presentations is predicted to be small by both models.
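
The "about a third fewer" figure can be checked directly against the last two columns of Table 4.5. The following is a minimal sketch, assuming the tabled entries for cycles 2 through 6 as they read in this copy; the names S_PRESENTATIONS, relative_reduction, and overall_reduction are hypothetical and introduced only for this illustration.

```python
# Expected presentations of an item in the short-term state S on cycle k,
# read from the last two columns of Table 4.5 (cycles 2 through 6).
# Keys are cycle numbers; values are (a+b=1 prediction, a+b<1 prediction).
S_PRESENTATIONS = {
    2: (0.38, 0.27),
    3: (0.25, 0.18),
    4: (0.12, 0.08),
    5: (0.05, 0.03),
    6: (0.02, 0.01),
}


def relative_reduction(greeno, general):
    """Fraction by which the a+b<1 prediction falls below Greeno's."""
    return 1.0 - general / greeno


def overall_reduction(table):
    """Relative reduction in S-state presentations summed over all cycles."""
    total_greeno = sum(g for g, _ in table.values())
    total_general = sum(m for _, m in table.values())
    return relative_reduction(total_greeno, total_general)


if __name__ == "__main__":
    for k, (g, m) in S_PRESENTATIONS.items():
        print(f"cycle {k}: {relative_reduction(g, m):.0%} fewer")
    print(f"cycles 2-6 combined: {overall_reduction(S_PRESENTATIONS):.0%} fewer")
```

Summed over cycles 2 through 6 the reduction comes to about 30 percent. The per-cycle ratios for the later cycles rest on two-decimal entries and are correspondingly coarse.
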

Summary of the Relative Performance of the Three Strategies

It might be helpful to review some properties of the special cases
of the GFT we have been considering before summarizing the results. No
cases have been examined for which c > a. There are two reasons for this
omission: first, no experiments have been reported for which c > a, at
least to the author's knowledge; second, the case c > a radically modi-
fies what is desirable in a strategy, because it is then desirable to
present items which are in the short-term state. The question of good
presentation strategies for this case would be interesting in itself if
situations arise where it applies. Among cases where c < a, we have



Table 4.5

Comparison of Predictions of Greeno's Model
and a Model that Permits a + b < 1

          Probability that        Expected number of      Expected number of
Cycle     item is in state        cumulative presen-      presentations of items
number    L at start of           tations through         in state S on cycle k
k         cycle k                 cycle k

          a+b=1      a+b<1        a+b=1      a+b<1        a+b=1      a+b<1

1         [illeg.]   [illeg.]     1.00       [illeg.]     [illeg.]   [illeg.]
2         .41        .38          3.30       3.37         .38        .27
3         .81        .81          4.93       4.99         .25        .18
4         .94        .94          6.20       6.24         .12        .08
5         .98        .98          7.31       7.33         .05        .03
6         .99        .99          8.36       8.37         .02        .01

*Note: For Greeno's model, parameters are a = .41, b = .59.
       For the more general model, a = .38, b = .36.








considered or can guess what would be predicted for models having ex-
treme values of the three parameters. That is, we have some basis for
saying what will happen for a large and small, for b = 0 and b = 1-a,
for c = 0 and c = a. The predictions are as follows.

1. When a is large, differences on other parameters are not im-
   portant. The operating characteristics of the three strategies
   are such that the modified OEM strategy has a slight advantage
   over the OEM strategy and the OEM has a slight advantage over
   the RC strategy. If a = c, the modified and unmodified OEM
   are practically identical.
2. When a is small and c = 0, the modified OEM procedure is far
   better than the other two. In this circumstance, the RC pro-
   cedure may even be slightly superior to the OEM procedure.
3. When a is small and c = a, the modified OEM is much better than
   the RC procedure, but not much better than the unmodified OEM
   procedure.

Regarding the importance of parameter b, we may say:

a. It is not important when a is large.

b. If a is small and c = 0, b determines the relative perfor-
   mance of the OEM and modified OEM procedures. The worst
   case for the OEM procedure is when b = 1-a and the best
   case when b = 0. (In the latter case, the OEM and modified
   OEM operating characteristics are identical.)

c. If a is small and c = a, the size of b is of little
   importance.






Some of the qualitative conclusions suggested here could be deduced
heuristically without calculating the operating characteristics in nu-
merical terms. The numerical calculations serve to transform the vague
generalizations which could be made without them into assertions whose
meaning can be made as precise and detailed as one wants.






CHAPTER V 
CONCLUDING DISCUSSION 

Attempts to deduce instructional implications from psychological
theories or empirical generalizations may be crudely classified as be-
longing to one of two types. One type of deduction is very informal,
perhaps, but not necessarily, because the relationship on which it is
based is only loosely formulated. Practically all deductions of impli-
cations for instructional practice were of this type until ten years ago.
At that time, the success of some very explicit mathematical theories of
simple learning processes led a few investigators to try more formal
derivation of instructional strategies. Because the explicit mathematical
statement of the consequences of instructional acts makes it possible to
formulate the question of optimal instruction policy in completely unam-
biguous terms, it is natural to seek the answer to this question. The
study undertaken in this paper is closer in spirit to this latter type
of approach, but it does involve what some might regard as regressive
elements of the first approach.

It was argued in the second chapter, with regard to the question of
item presentation strategy, that the globally optimal strategy corres-
ponding to the GFT is too complicated to be of central interest. But
what is of interest if the optimal strategy is not? Surely, if a reason-
able model of the process of learning items exists, it should be possible
to use it to make judgments about presentation strategies, even if it is
not practical to work with the globally optimal strategy based on that
model. The problem is that the bases for the judgments may come to
depend somewhat on the biases and preferences of the individual investi-
gator. For example, investigator A may argue for the strategy maximizing
immediate gain while investigator B pushes a modification of the OEM
strategy, both justifying their choice on the basis of the GFT. Some
theorists would regard this as an unpleasantly awkward situation; others
would see nothing wrong with it. Tukey (1962), for example, has stated
that the question of which statistical procedure is "optimal" in a given
situation does not interest him until he knows of four sensible alter-
natives that have demonstrably different properties. At that point,
lack of a criterion for choosing between the alternatives becomes a
concern. It may happen that none of the alternatives is globally op-
timal, but one or more of them is very nearly optimal. If a theoretical
analysis could identify such a situation when it occurs it would be very
helpful, even if the analysis does not yield an optimal procedure.

The descriptive analysis of three presentation procedures under GFT
assumptions given in Chapters III and IV provides some basis for saying
when a strategy is nearly optimal. When learning is very rapid, for
example, both the RC and the OEM strategies are very nearly optimal,
independent of the exact parameter values. When learning is extremely
slow, the RC procedure is poor, and how bad the OEM procedure is depends
very much on the exact parameter values. These theoretical results are
consistent with the results of the few empirical studies that have been
done. Whether or not they will hold up under more direct experimental
scrutiny is an open question.

There are a number of limitations imposed by the scope of this
study which would need to be considered before applying the conclusions
in a particular list learning situation. It has been assumed that the
GFT adequately describes the process governing learning and retention;
that the items have neutral transfer value with respect to each other;
that all the items are unknown at the start of instruction; that corres-
ponding learning parameters are equal from one item to the next; and
that the reward structure can be taken to be a simple function of the
overall probability of correct response at the end of instruction. One
or more of these assumptions are very likely to be violated in practice.
Violations may or may not have damaging consequences for a given strategy.
There are some relevant studies that relate to some of these consequences.
Let us review some of them now.

The adequacy of the GFT framework. It is almost certain that the
GFT framework could be shown to be an oversimplified account of the
process of learning and retention. The phenomena of human information
processing are now being studied with particular intensity. At times
it seems as though important new developments are appearing monthly.
In a climate of such intense experimental and theoretical inquiry all
bets are off concerning the adequacy of any simple model. One aspect
of the GFT which is suspect concerns its representation of what happens
to an item which is not presented on a given trial. The GFT assumes
that no learning takes place in this situation. But suppose a subject
surreptitiously rehearses an item for a few trials after it has been
presented. The GFT assumes that transitions to the long-term state
could not take place via such a process. In fact, in a very successful
model of human memory proposed by Atkinson and Shiffrin (1968), such a
rehearsal process plays a central role. It is beyond the scope of this
paper to guess what the implications of other models of learning and
memory might be with regard to item presentation strategy. But it is
important to note that the GFT is not the only way that the memory pro-
cesses can be modeled. For purposes of this study it suffices that it
is a reasonable way to model the process.

Heterogeneity of items. The assumptions that items are all unknown
at the start of instruction and that their learning parameters are equal
from item to item are almost certain to be violated unless extraordinary
measures are taken to insure that they hold. The necessity for such
measures notably reduces the general applicability of the procedures
requiring them. It may be that a procedure will be reasonably robust
with respect to minor violations of homogeneity. Calfee (1970), for
example, carried out some numerical calculations which suggest that this
is the case for the OEM procedure, provided the other OEM assumptions
are satisfied. We might argue, by analogy, that a suitably modified
OEM procedure would stand up pretty well under minor deviations from
item homogeneity, provided the other assumptions of the GFT hold. If
item heterogeneity is extensive, however, a couple of studies have
shown that parameter-dependent strategies which take these item differ-
ences into account will out-perform the OEM procedure. See Laubsch
(1969), and Atkinson and Paulson (1972).

The importance of item heterogeneity has been noted; it should also
be noted that in order to implement a parameter-dependent strategy the
key problem is to find suitable parameter estimates. These estimates
must separate subject and item differences, so we are led indirectly to
a consideration of individual differences. The consideration of subject
differences required has an interesting twist to it: it is not enough to
estimate the state of knowledge of a subject; one must also measure in a
fairly "direct" sense the subject's ability to learn. These measure-
ments may have some important implications for the concept of intelligence
and its assessment. In a symposium on the nature of intelligence, Hunt
(1972) described some exploratory studies of persons of above average
intelligence who could be classified into two groups on the basis of
being more quantitatively than verbally oriented, and vice versa. These
subjects were given continuous memory tasks like the one described by
Atkinson and Shiffrin (1968), and model parameters were estimated for
their model. Consistent individual differences between subjects were
found; these differences were meaningfully related to differences in
their independently determined profiles of ability. Parameter-dependent
strategies should be of continuing practical and theoretical interest.

Other crucial assumptions. It is patently clear for some kinds of
curriculum material that some ways of sequencing the material make sense
and others do not. When there is a natural sequence or hierarchy in the
material to be learned, good presentation strategies must take it into
account. Such strategies are beyond the scope of this study. Also be-
yond the scope of this study are situations where the items to be learned
are differentially weighted in terms of their importance. Smallwood
(1970) has considered this kind of problem, using an OEM theoretical
framework which allows for item heterogeneity.

Final Remark 

It should be apparent from the preceding discussion that the item
selection problem is not a single problem with a single solution, but
rather is a family of problems representing a wide range of educational
situations in which the question of optimal procedure is open. The study
described in this paper has addressed one of these problems. Other
studies examining some of the other problems which remain unresolved are
in progress. It is the goal of all of these studies to develop general
methods which may be used to attack more complex optimization problems.
Three universal aspects of problems of optimizing instruction are empha-
sized: (1) the development of an adequate description of the learning
process, (2) the assessment of costs and benefits associated with pos-
sible instructional actions and states of learning, and (3) the derivation
of optimal strategies based on the goals set for the student.

The format of the list learning task is simple enough that all three
aspects of optimization problems mentioned above can be subjected to de-
tailed experimental and theoretical analysis. In addition to research
described in this paper, instructional strategies which explicitly take
individual differences into account are now being studied. There should
be studies in the near future utilizing organizational features of the
material to be learned in constructing optimal strategies. While the
direct implications which can be drawn from such formal optimization
studies of list learning are necessarily limited, the fact that many
prototypical educational problems remain unresolved even within this
restricted context justifies continued expenditure of effort at this



REFERENCES

Atkinson, R. C., & Crothers, E. J. A comparison of paired-associate
    learning models having different acquisition and retention axioms.
    Journal of Mathematical Psychology, 1964, 2, 285-315.

Atkinson, R. C., & Paulson, J. A. An approach to the psychology of
    instruction. Psychological Bulletin, 1972, 78, 49-61.

Atkinson, R. C., & Shiffrin, R. M. Human memory: A proposed system and
    its control processes. In K. W. Spence and J. T. Spence (Eds.),
    The psychology of learning and motivation: Advances in research and
    theory. Vol. II. New York: Academic Press, 1968.

Bernbach, H. A. A forgetting model for paired-associate learning.
    Journal of Mathematical Psychology, 1965, 2, 128-144.

Bower, G. H. Application of a model to paired-associate learning.
    Psychometrika, 1961, 26, 255-280.

Calfee, R. C. The role of mathematical models in optimizing instruction.
    Scientia: Revue Internationale de Synthese Scientifique, 1970, 105,
    1-25.

Dear, R. E., Silberman, H. F., Estavan, D. P., & Atkinson, R. C. An
    optimal strategy for the presentation of paired-associate items.
    Behavioral Science, 1967, 12, 1-13.

Feller, W. An introduction to probability theory and its applications,
    Vol. II. New York: John Wiley, 1966.

Greeno, J. G. Paired-associate learning with massed and distributed
    repetition of items. Journal of Experimental Psychology, 1964,
    67, 286-295.

Greeno, J. G. Paired-associate learning with short-term retention:
    Mathematical analysis and data. Prepublication, 1966. Cited by
    Rumelhart (1967).

Hansen, D. N. Paired-associate learning with young children.
    Unpublished doctoral dissertation, Stanford University, 1963.

Hellyer, S. Supplementary report: Frequency of stimulus presentation
    and short-term decrement in recall. Journal of Experimental
    Psychology, 1962, 64, 650.

Hunt, E. Paper presented in a symposium on the theory of intelligence
    at the Western Psychological Association convention in Portland,
    Oregon, April 1972.

Karush, W., & Dear, R. E. Optimal stimulus presentation strategy for
    a stimulus sampling model of learning. Journal of Mathematical
    Psychology, 1966, 3, 19-47.

Laubsch, J. H. An adaptive teaching system for optimal item allocation.
    Technical Report 151, Institute for Mathematical Studies in the
    Social Sciences, Stanford University, 1969.

Rumelhart, D. E. The effects of interpresentation intervals on per-
    formance in a continuous paired-associate task. Technical Report
    116, Institute for Mathematical Studies in the Social Sciences,
    Stanford University, 1967.

Smallwood, R. D. The analysis of economic teaching strategies for a
    simple learning model. Journal of Mathematical Psychology, 1971,
    8, 285-301.

Tukey, J. W. The future of data analysis. The Annals of Mathematical
    Statistics, 1962, 33, 1-67.



- - DISTRIEUTION LIST 



N avy 

h Dr. Marshall J. Director 

Personnel & Training Research Programs 
Office of Naval Research 
Arlington^ VA 227.17 

1 Director - ^ 

ONR Branch Offico 
• 495 Summer Street 

Boston MA 02210 - 

ATTN: C. M. Harsh 

1 Director 
^ ONR Branch Office 

1030 Eaist Green Street 
Pasadena, OA 91IOI 
ATTN: E* Gloye 

1 Director - r 

ONR Branch Office 

536 South Clark Street j 
Chicago, IL 60605 
' ATTN: M. A. Bertin 

1 .Office of Naval Research 
Area Office 
207 West 24th Street 
New York^ NY 10011 

6 Director 

Naval Research Laboratory . 
. Code 2627 

Washington, DC 20390 

12 Defense Documentation Center 
Cameron Station, Building 5 ' 
5010 Duke Street ■ 
Alexandria, VA 223 lU 

1 Chainri^n^ 

Behavioral Science Department 
Naval Command and Management Division 
U^S* Naval Academy 
Luce Hall 

Annapolis, 'AB 2lk02 

ERLC 



1 Chief of Naval Technical Training 
Naval Air Gtation Memphis (75) ' 
Millington, TN 3805^^ 
ATTN: Dr. G. D. Mayo 

1 Chief of Naval Training 
Naval Air Station 
Pensacola, FL 3250^ 
ATTN: Capt. Allen McMichael 

1 LCDR Charle^s J. Theisen, Jr* , MSC, USN 
i^02U 

Naval Air .Development Center 
Waminster, PA 1j897U 

1 Commander 

Naval Air Reserve 
Naval Air Station 
Glenviev, IL 6OO26 

. 1 Commander . ' 

Naval Air Systems Coramanl 

Department of the Navy 

^ AIR-1+13C 

Washington, DC 2036O 

1 Mr. Lee Miller (AIR.lfl3E) 
Naval Air Systems Command 
5600 Columbia Pike 
Falls Church, VA 2201+2 

1 Dr. Harold Boo her . 
. NAVAIR 1+15C ■ 
Naval Air Systems Command 
5600 Columbia Pike 
Falls Chuisch, VA 220^2 

1 Capt. John F, Riley, USN 
Commanding Officer 
U.S. Naval Amphibious School 
Coronaau^ CA 92155 



1 



1 Special Assistant for Manpower 
OASN (MScRA) r- , \ 
The Pentagon^ Room 41^9^ 
Washington;, DC 20350^. 

1 Dr. Richard J. Niehaus 

Office of Civilian Manpower Mr- agement 
Code 06A 

Depart:;ient of the Na\'y 
Washington, DC 20390 

1 CDR Richard L« Martin, USN 
COMFAIRMIRAMAR ?-1h 
NAS Mirsmar,, OA 921I+5 ^ 

1 Res>:^arch Director, Cc w 06 

Research and Evaluation 'Department 
U.S« i^l aval Examining Center' - 
GreaT Lakes ^ IL 60088' : 
ATTN; C. Winievicz 

1 Chief . . 

Eiireau of Medicine and Surgery 
Code i+13 

Washington^, DC 20372 



1 Program Coordinator 

Bureau o|; Medicine and Surgery (Cede 7IG) 
Department of the Navy . 
Washington^' DC 20372 

1 f:onuT;ar;ding Officer 

Naval Medical Neuropsychiatric 

Research Unit 
San Diego, CA 92152 

i ■ . 

X Jc'nn J. Collins 

Chief of Naval Operations (OP-987F) 
Departnient of the Nav^^ ' 
Washington^ DC 203^0 

1 Technical Library (Pers-llB) 

Bureau of Naval Personnel 
■ Department of the Navy 

Washington^, DC 2036O ■ * . 

1 Technical Director \ ' • 

Naval Personnel Research and 

Development Center 
San Diego, CA 92152-/- 



1 Commanding Officer 

Naval Personnel Research and 

Development Center 
San DiegO;^ CA 92152 

1 Superintendent 

naval Postgraduate School 
^ Monterey, CA 929^40 - 

ATTN: Library (Code 212^4 ) ■ ' 

1 Mr, George N. Graine 

Na>;^al Ship Systems Comr,iand 
X SHIPS 03H) 
Department of the Navy 
Wa.;hington, DC 2036O 

1 Technical Library 

' Naval Ship Systems Comjnand 

National Center, Building 3 

Room 3^08 ' ' 

Washington, DC 2036O 

1 Commanding Officer 
Service School Command 
U.S. Naval Training Center 
" San Die^Q, CA 92133 
ATTN: Code 303 

1 Chief of Naval Training Support 

Code N-21 
' Building ^5 ]- 

Naval Air Station ^ 

Pensacola, FL 32508 

i Dr- William L. Maloy 

Principal Civilian Advisor _ _ 

for Education and Training 
• Naval Training Coirnnand, Code OlA 

Pensacola, FL 32508 

1 Mr. 'Arnold Rabinstein 

Naval Material Command (lMAT-03^2if) 
Room 8^0, Crystal Plaza No. 6 
Washington, DC 2036O / . 



ERIC 



Army 

1 Commandant 

U.S. AnTjy Institute of ^Administration 
Fort Benjv9min Harrison, IN ii62l6 
ATTN: EA 

1 Aimed Forces Staff College 
. Norfolk^ VA 235II 
ATT:'^: Library 

1 Director of Research 

U.S.^ Army Armor Human Research Unit . 
Building 2ii22, Morade Street 
Fort Knox. KY if 0121 
ATTN: Library 

1 U.So Aiitiy Research Institute for the 
Behavioral and Sccial Sciences- 
1300 Wilson Boulevard 
Arlington, VA^ 22209 ' 

1 Commanding' Officer & 
USACDC - ?A3A 

Yt. Benjamin Harri&on^ IN k62k9 
ATTN: LTC Montgomery 
{ 

1 Dr. John L» Kobrick 

Military Stress Laboratory 
U.S. Army Research Institute of 

Environmental Medicine 
NaticK^ MA OI76O 

i Commandant 

United States Array Infantry School 
Fort Benning^ GA 31905 
ATTN: ATSIN-H 

1 U.S. Ar-my ResearcH^nstitute 
Commonwealth Building^ Room 239. 
1300 V/ilson Boulevard 
Arlington VA 22209 - . ' 

. ATTN: Dr, R. Dusek 
) 

1 Mr. Edmund- F. Fuchs - • • 

U.S. Army Research Institute 
1300 Wilsjn Boulevard ' • 
Arlington^ VA ' 22209 • 



1 Chief, Unit Training and Educational 
Tecluiology Systems i 
U.S. Amriy Research Institute for the 

Behavioral and Social Sciences 
1300 Wilson Boulevard 
Arlington, VA 22209 

1 Commander 

U.S." Theater Army Support Command, 

•Europe 
APO 'New" -York 09058 
ATTN: Asst. DCSPER (Education)- 

1 Dr. Stanley L. Cohen 
Work Unit Area Leader 
Organizational Development Work Unit 
Army Research "Institute for Behavioral 

and Social Science 
1300 Wilson' Boulevard 
.Arlington, VA 22209 

1 Dr. .Leon H« Nawrocki 

U-S;, Army Research Institu^te 
Roiislyn Commonwealth Bui-lding 
1300 Wilson Boulevard 
Arlington, VA 22209 ' 

Air Force 

1 Headquarters, U.S. AirF^:K^e 

Chief, Personnel Research and Analysis 

.Division (AF/DPSY). ' . 
Washington, DC 20330 

1 Research and Analysis Division 
AF/DPXYR,, Room i^C200 
Washington, ^DC 20330 

1 AFHRL/AS (Dr. G. A. Eckstrand) 
Wright-Patterson^ AFB 
Ohio i^5^33 ' - 

1 AFHRL. tAST/Dr. Ross L. Morgan) 
Wright-Patterson AFB 
Ohio . ,.^5^33 

1 AFHRI/MD • . 

701 Prince Street 
Room 200 

Alexandria, VA 2231^^ 



1 AFOSR (JIL) - ■ " 

ikOO Wilson Boulevard: 

Arlington, VA 22209 ^ 

1 Cammafidaht 

USAF School of Aerospace Medicine 
Aeromedical Library (SUL-H) 
Brooks APB, TX 78-?35 

1 CAPT Jack Thorpe^ USAF 
Department of Psychology 
Bowling Green S^ate University 
Bowling Green ^ OH 4,3^^03 

1 Headquarters^ Electronic Systems Division 
LG Hanscom Field 
Bedford, M 01730 
ATTN: Dr. Sylvia P. Mayer/MCIT 

Marine Corps ^ ^ 

.1, COL George Caridakis 

Director^ Office of Manpower 

Utilization 
Heaii^uarters^ Marine...Ci3rps (AOIH) 
MCB ■ ^ ■ 

Quantico^ VA 2213I; . 

1 Dr. A, L. Slafkosky 

Scientific Advisor (Code Ax) 
Coimriandant of the Marine Corps • 
Vashxngton^ DC 2038O ^ ' 

1 Mfo E. A, Dover . 

Map.power Measurement Unit (Code AOlM-2) 
Arlington Annex^ Room 2^13 
Arlin^ton^ VA 20370- ■ . 

Coast Guard 

1 Joseph J. Cowan^ Chief 

Psychological Research Branch (P-l) ' 
U,S- Coa^t- Guard Headquarters 
i+00 -Seventh Street, 
Washington', DC 20590 

Other i>JD 

1 Lt, 'Col» Austin W. Kibler, Director ' 
. Kur.an Resources Research Office 
Advanced Research Projects Agency 
Q ikOO V/ilson Boulevajrd . * 



1 Dr. Helga Yeich, Director 

Program Man-^igement^ Defense Advanced 
* Research Projects Agency 

1^00 Wilson Boulevard 
■ Ar.Tington^ VA 22209 

1 Mr. William J. Stomier 
DOD Computer Institute 
Washington Navy Yard 
Building I75 
Washington^ DC 2037^ 

1 Mr. Ralph Ti. Carter 

Director for Manpower Reoearch^ 
Office of SecretF.ry of Defense 
The Pentagon^ Room 3C98O 
V/s^hington, DC 20301 

Other Government 

1 Office of Computer Infoimation 
Institute for Computer Sciences 

and' Technology ; 
National Bureau of Standards 
Washington^ DC 2023^^' 

Miscellaneous 

1 Dr, Scarvia Anderson 

Executive Director for Special 

Development * • ' / 

Educational Testing Sei-vice , ' 
Princeton, 085^+0 



1 Dr. Richard C Atkinson 1 
Department of Psychology 
Stanford University 
Stanford, CA 9^4305 



1 Dr. Bernard M. Bass 
Mana^'^ment Rasearch Center 
University of Rochester 
Rochester^ m 14627 

1 Mr. Edmund C. Berkeley 
Berkeley' Enterprises, Inc. 
815 Washingfjon l^treet 
Newt onvi lie', I-IA , 02l60 



lERJ^C^llngton, VA 22209 



?_ Dr. David G.. Bowers 
University of Michigan 
Institute for Social Research 
P.O. Box 12k8 
Ann Arbor, MI h8l06 

1 Mr. H. Dean Brovm 
Stanford Research Insti.tute 
333 Ravenswood Avenue 
Menlo Park^ CA 9^1025 

.1 Mr* Michael W. Brown 
Operations Research, Irx. 
1^0 ^ Spring Street 
Sil.er Spring;, MD 20910 

1 Dr* Ronald P. Carver 

American Institutes for Flesearch 
6555 Sixteenth Street 
Silver opring, MD 20910 

\ 

1 Century Research Corpord^tj^on 
1^113 J^e -Highway 
Arlington. VA 22207' . . 

1 Dr^ Kenneth E. Clark 
University of Rochester 
College of Arts and Scienoes 
River Campus Station 
Rochester, NY 1^627 

1 Dr. Allan M» Collins 
Bolt Beranek and I^ewman 
50 Moulton Street 
Cambridge, UA G2138 

1 Dr. Rene V» Dawis 
University of Minnesota 
Department* of Psychology 
Minneapolis^ M' 55^^55 

2 ERIC 

Processing and Reference Facility 
^833 Hugby Avenue 
Be the 3 da, MD 2001U 

1 Dr» Victor ;^ields 

Department of Psychology 
Montgomery Cclliege 
Rockville^ MD 2085O 



1 Dr. Edwin A.' Fleishman 

American Institut^es for Research 
8555 Sixteenth Sti\^et 
Silver Spring, 14D 20910 

1 Dr. Robert. Glaser, Director 

Learning Research ^ id Development 
Center 

University of p-^ tts burgh ' 
Pittsburgh, F/ 15213 ' 

1 Dr. Albert S/ Glickir.an 

American Institutes for Ffesearch 
8555 Sixteenth Street 
Silver Spring, MD 2O9IO 

1 Dr. Duncan N, Hansen 

Center for Computer-Ass^stecl 
Instruction 

Elorida State University 
Tallahassee, FL 32306 

1 Dr. Henry J. Hamburger 
University of California 
School of Social Sciences 
Irvine, CA 9266^1 

1 Dr. Richard S, Hatch 

Decision Systems Asr^'^ciates , Inc. 
11^28 Rockville Pike 
Rockville, MD 20852 

1 Dr. M. D, Havrbn - " ^ " 

Human Sciences Research, Inc. 
Westgate Industrial Park 
7710 Old Springhouse Road 
McLean^ VA 22101 

1 Human Resources Research Organizatic«n 
Division No, 3 
P.O. Box 5787 

Presidio of Monterey, CA 939^0 

1 Human Resources Research Organization 
iDiviston No. '^Infantry 
P.O.- Box 2086 
Fort Benning, GA 3I905 



^ Human Resources E.esearch Organization 
Division No. 3, Air Tefense 
P.O. Eox 6057 t^. 
Fort Bliss, TX 79916 ^ 

1 Huaian Resources Research Organization 
^ Division No. 6, Library 
P.O. Box h2S 
Fort Rucker, AL 3636O 

1 Dr, Lawrence B. Johnson 

Lavrence Johnson and Assooiat^^s, Inc. 
200 S Street, Suite 502 

Vasnington, DC''"'20O09 

1 Dr, r^crman J. Johnson 

School of Urban and PuVxic Affairs ■ 
Carnegie -Me lion University 
Pittsburgh, PA 15213 

1 Dr. David Klahr 

Graduate School of Industrial 

AdiP-inistration 
Qamegie-Mellon University 
Pittsburgh^ . PA I5213 

1 Dro Robert R. Mackie 

HujTian Fact on- Res'ear::h, Inc. 
6780 Cortona Drive 
Santa- Baroara. Rei^earch Park 
Gclera^ OA 93017 

1 Di. Ajidttjw H. Molnar 

Techriological Innovations in Education 
Nati'.^nal Science Foundation 
Waohington, DC 20550 

. 1 Dr * Leo Munday, Vice Preslde-it 
American College Testing Prcgram 
P.O. Box 168 
Iowa Ci^y, lA 52240 

■ 1 Dr. Donald A. Norman . 
• Center lor Human InforrnPtion Processing 
Urjiversity of California, San Diego 
La Jolla, CA 92037 , 

. 1 .Mr* Luxgi Petrullo 

2I43I North Edgevood Street 
Arlington, VA 22207 

ERIC 



1 Dr. Robert D. Pritchard 
Department of Psychology 
Purdue University 
L£.fayette, IN ii7907 

1 Dr. Diane M. Rainsey^-Klee 
R-K Research &• '^'vstem Design 
39^7 Ridgemont L-lve 
Malibu, CA ' 90265 

1. Dr, Joseph W. Rigney 

Behavioral Technology Laboratories 
University of Sou\\hern California 
3717 South Grand 
Lcs Angeles, CA 90OO7 

1 Dr. Leonard L. Rosenbaum, Chairman
Department of Psychology
Montgomery College
Rockville, MD 20850

1 Dr. George E. Rowland 
Rowland and Company, Inc. 
P.O. Box 61 

Haddonfield, NJ 08033

1 Dr. Arthur I. Siegel

Applied Psychological Services
Science Center
404 East Lancaster Avenue
Wayne, PA 19087

1 Mr. Dennis J. Sullivan
725 Benson Way
Thousand Oaks, CA 91360

1 Dr. Benton J. Underwood 
Northwestern University 
Department of Psychology 
Evanston, IL 60201

1 Dr. David J. Weiss 

Department of Psychology 
University of Minnesota 
Minneapolis, MN 55455

1 Dr. Anita West 

Denver Research Institute
University of Denver 
Denver, CO 80210 



1 Dr. Kenneth Wexler

School of Social Sciences
University of California
Irvine, CA 92664

1 Dr. John Annett
The Open University
Milton Keynes
Buckinghamshire, ENGLAND

1 Maj. P. J. DeLeo

Instructional Technology Branch
AF Human Resources Laboratory
Lowry AFB, CO 80230

1 Dr. Martin Rockway

Technical Training Division
Lowry Air Force Base
Denver, CO 80230

1 Dr. Eric McWilliams, Program Manager
Technology and Systems TIE
National Science Foundation
Washington, DC 20550

1 Dr. Milton Katz 
MITRE Corporation 
Westgate Research Center 
McLean, VA 22101 

1 Dr. Charles A. Ullmann

Director, Behavioral Sciences Studies
Information Concepts Incorporated
1701 No. Ft. Myer Drive
Arlington, VA 22209



(Continued from inside front cover)



96 R. C. Atkinson, J. W. Brelsford, and R. M. Shiffrin. Multi-process models for memory with applications to a continuous presentation task. April 13, 1966. (J. math. Psychol., 1967, 4, 277-300).

97 P. Suppes and E. Crothers. Some remarks on stimulus-response theories of language learning. June 12, 1966.

98 R. Bjork. All-or-none subprocesses in the learning of complex sequences. (J. math. Psychol., 1968, 5, 182-195).

99 E. Gammon. The statistical determination of linguistic units. July 1, 1966.

100 P. Suppes, L. Hyman, and M. Jerman. Linear structural models for response and latency performance in arithmetic. (In J. P. Hill (ed.), Minnesota Symposia on Child Psychology. Minneapolis, Minn.: 1967. Pp. 160-200).

101 J. L. Young. Effects of intervals between reinforcements and test trials in paired-associate learning. August 1, 1966.

102 H. A. Wilson. An investigation of linguistic unit size in memory processes. August 3, 1966.

103 J. T. Townsend. Choice behavior in a cued-recognition task. August 8, 1966.

104 W. H. Batchelder. A mathematical analysis of multi-level verbal learning. August 9, 1966.

105 H. A. Taylor. The observing response in a cued psychophysical task. August 10, 1966.

106 R. A. Bjork. Learning and short-term retention of paired associates in relation to specific sequences of interpresentation intervals. August 11, 1966.

107 R. C. Atkinson and R. M. Shiffrin. Some two-process models for memory. September 30, 1966.

108 P. Suppes and C. Ihrke. Accelerated program in elementary-school mathematics--the third year. January 30, 1967.

109 P. Suppes and I. Rosenthal-Hill. Concept formation by kindergarten children in a card-sorting task. February 27, 1967.

110 R. C. Atkinson and R. M. Shiffrin. Human memory: A proposed system and its control processes. March 21, 1967.

111 Theodore S. Rodgers. Linguistic considerations in the design of the Stanford computer-based curriculum in initial reading. June 1, 1967.

112 Jack M. Knutson. Spelling drills using a computer-assisted instructional system. June 30, 1967.

113 R. C. Atkinson. Instruction in initial reading under computer control: The Stanford Project. July 14, 1967.

114 J. W. Brelsford, Jr. and R. C. Atkinson. Recall of paired-associates as a function of overt and covert rehearsal procedures. July 21, 1967.

115 J. H. Stelzer. Some results concerning subjective probability structures with semiorders. August 1, 1967.

116 D. E. Rumelhart. The effects of interpresentation intervals on performance in a continuous paired-associate task. August 11, 1967.

117 E. J. Fishman, L. Keller, and R. C. Atkinson. Massed vs. distributed practice in computerized spelling drills. August 18, 1967.

118 G. J. Groen. An investigation of some counting algorithms for simple addition problems. August 21, 1967.

119 H. A. Wilson and R. C. Atkinson. Computer-based instruction in initial reading: A progress report on the Stanford Project. August 25, 1967.

120 F. S. Roberts and P. Suppes. Some problems in the geometry of visual perception. August 31, 1967. (Synthese, 1967, 17, 173-201)

121 D. Jamison. Bayesian decisions under total and partial ignorance. D. Jamison and J. Kozielecki. Subjective probabilities under total uncertainty. September 4, 1967.

122 R. C. Atkinson. Computerized instruction and the learning process. September 15, 1967.

123 W. K. Estes. Outline of a theory of punishment. October 1, 1967.

124 T. S. Rodgers. Measuring vocabulary difficulty: An analysis of item variables in learning Russian-English and Japanese-English vocabulary parts. December 18, 1967.

125 W. K. Estes. Reinforcement in human learning. December 20, 1967.

126 G. L. Wolford, D. L. Wessel, W. K. Estes. Further evidence concerning scanning and sampling assumptions of visual detection models. January 31, 1968.

127 R. C. Atkinson and R. M. Shiffrin. Some speculations on storage and retrieval processes in long-term memory. February 2, 1968.

128 John Holmgren. Visual detection with imperfect recognition. March 29, 1968.

129 Lucille B. Mlodnosky. The Frostig and the Bender Gestalt as predictors of reading achievement. April 12, 1968.

130 P. Suppes. Some theoretical models for mathematics learning. April 15, 1968. (Journal of Research and Development in Education, 1967, 1, 5-22)

131 G. M. Olson. Learning and retention in a continuous recognition task. May 15, 1968.

132 Ruth Norene Hartley. An investigation of list types and cues to facilitate initial reading vocabulary acquisition. May 29, 1968.

133 P. Suppes. Stimulus-response theory of finite automata. June 19, 1968.

134 N. Moler and P. Suppes. Quantifier-free axioms for constructive plane geometry. June 20, 1968. (In J. C. H. Gerretsen and F. Oort (Eds.), Compositio Mathematica. Vol. 20. Groningen, The Netherlands: Wolters-Noordhoff, 1968. Pp. 143-152.)

135 W. K. Estes and D. P. Horst. Latency as a function of number of response alternatives in paired-associate learning. July 1, 1968.

136 M. Schlag-Rey and P. Suppes. High-order dimensions in concept identification. July 2, 1968. (Psychon. Sci., 1968, 11, 141-142)

137 R. M. Shiffrin. Search and retrieval processes in long-term memory. August 15, 1968.

138 R. D. Freund, G. R. Loftus, and R. C. Atkinson. Applications of multiprocess models for memory to continuous recognition tasks. December 18, 1968.

139 R. C. Atkinson. Information delay in human learning. December 18, 1968.

140 R. C. Atkinson, J. E. Holmgren, and J. F. Juola. Processing time as influenced by the number of elements in the visual display. March 14, 1969.

141 P. Suppes, E. F. Loftus, and M. Jerman. Problem-solving on a computer-based teletype. March 25, 1969.

142 P. Suppes and Mona Morningstar. Evaluation of three computer-assisted instruction programs. May 2, 1969.

143 P. Suppes. On the problems of using mathematics in the development of the social sciences. May 12, 1969.

144 Z. Domotor. Probabilistic relational structures and their applications. May 14, 1969.

145 R. C. Atkinson and T. D. Wickens. Human memory and the concept of reinforcement. May 23, 1969.

146 R. J. Titiev. Some model-theoretic results in measurement theory. May 22, 1969.

147 P. Suppes. Measurement: Problems of theory and application. June 12, 1969.

148 P. Suppes and C. Ihrke. Accelerated program in elementary-school mathematics--the fourth year. August 7, 1969.

149 D. Rundus and R. C. Atkinson. Rehearsal in free recall: A procedure for direct observation. August 12, 1969.

150 P. Suppes and S. Feldman. Young children's comprehension of logical connectives. October 15, 1969.




(Continued on back cover)

(Continued from inside back cover)



151 Joaquim H. Laubsch. An adaptive teaching system for optimal item allocation. November 14, 1969.

152 Roberta L. Klatzky and Richard C. Atkinson. Memory scans based on alternative test stimulus representations. November 25, 1969.

153 John E. Holmgren. Response latency as an indicant of information processing in visual search tasks. March 16, 1970.

154 Patrick Suppes. Probabilistic grammars for natural languages. May 15, 1970.

155 E. Gammon. A syntactical analysis of some first-grade readers. June 22, 1970.

156 Kenneth N. Wexler. An automaton analysis of the learning of a miniature system of Japanese. July 24, 1970.

157 R. C. Atkinson and J. A. Paulson. An approach to the psychology of instruction. August 14, 1970.

158 R. C. Atkinson, J. D. Fletcher, H. C. Chetin, and C. M. Stauffer. Instruction in initial reading under computer control: The Stanford project. August 13, 1970.

159 Dewey J. Rundus. An analysis of rehearsal processes in free recall. August 21, 1970.

160 R. L. Klatzky, J. F. Juola, and R. C. Atkinson. Test stimulus representation and experimental context effects in memory scanning.

161 William A. Rottmayer. A formal theory of perception. November 3, 1970.

162 Elizabeth Jane Fishman Loftus. An analysis of the structural variables that determine problem-solving difficulty on a computer-based teletype. December 18, 1970.

163 Joseph A. Van Campen. Towards the automatic generation of programmed foreign-language instructional materials. January 11, 1971.

164 Jamesine Friend and R. C. Atkinson. Computer-assisted instruction in programming: AID. January 25, 1971.

165 Lawrence James Hubert. A formal model for the perceptual processing of geometric configurations. February 19, 1971.

166 J. F. Juola, I. S. Fischler, C. T. Wood, and R. C. Atkinson. Recognition time for information stored in long-term memory.

167 R. L. Klatzky and R. C. Atkinson. Specialization of the cerebral hemispheres in scanning for information in short-term memory.

168 J. D. Fletcher and R. C. Atkinson. An evaluation of the Stanford CAI program in initial reading (grades K through 3). March 12, 1971.

169 James F. Juola and R. C. Atkinson. Memory scanning for words versus categories.

170 Ira S. Fischler and James F. Juola. Effects of repeated tests on recognition time for information in long-term memory.

171 Patrick Suppes. Semantics of context-free fragments of natural languages. March 30, 1971.

172 Jamesine Friend. INSTRUCT coders' manual. May 1, 1971.

173 R. C. Atkinson and R. M. Shiffrin. The control processes of short-term memory. April 19, 1971.

174 Patrick Suppes. Computer-assisted instruction at Stanford. May 19, 1971.

175 D. Jamison, J. D. Fletcher, P. Suppes, and R. C. Atkinson. Cost and performance of computer-assisted instruction for compensatory education.

176 Joseph Offir. Some mathematical models of individual differences in learning and performance. June 28, 1971.

177 Richard C. Atkinson and James F. Juola. Factors influencing speed and accuracy of word recognition. August 12, 1971.

178 P. Suppes, A. Goldberg, G. Kanz, B. Searle, and C. Stauffer. Teacher's handbook for CAI courses. September 1, 1971.

179 Adele Goldberg. A generalized instructional system for elementary mathematical logic. October 11, 1971.

180 Max Jerman. Instruction in problem solving and an analysis of structural variables that contribute to problem-solving difficulty. November 12, 1971.

181 Patrick Suppes. On the grammar and model-theoretic semantics of children's noun phrases. November 29, 1971.

182 Georg Kreisel. Five notes on the application of proof theory to computer science. December 10, 1971.

183 James Michael Moloney. An investigation of college student performance on a logic curriculum in a computer-assisted instruction setting. January 28, 1972.

184 J. E. Friend, J. D. Fletcher, and R. C. Atkinson. Student performance in computer-assisted instruction in programming. May 10, 1972.

185 Robert Lawrence Smith, Jr. The syntax and semantics of ERICA. June 14, 1972.

186 Adele Goldberg and Patrick Suppes. A computer-assisted instruction program for exercises on finding axioms. June 23, 1972.

187 Richard C. Atkinson. Ingredients for a theory of instruction. June 26, 1972.

188 John D. Bonvillian and Veda R. Charrow. Psycholinguistic implications of deafness: A review. July 14, 1972.

189 Phipps Arabie and Scott A. Boorman. Multidimensional scaling of measures of distance between partitions. July 26, 1972.

190 John Ball and Dean Jamison. Computer-assisted instruction for dispersed populations: System cost models. September 15, 1972.

191 W. R. Sanders and J. R. Ball. Logic documentation standard for the Institute for Mathematical Studies in the Social Sciences. October 1972.

192 M. T. Kane. Variability in the proof behavior of college students in a CAI course in logic as a function of problem characteristics. October 6, 1972.

193 P. Suppes. Facts and fantasies of education. October 18, 1972.

194 R. C. Atkinson and J. F. Juola. Search and decision processes in recognition memory. October 27, 1972.

195 P. Suppes, R. Smith, and M. Léveillé. The French syntax and semantics of PHILIPPE, part 1: Noun phrases. November 3, 1972.

196 D. Jamison, P. Suppes, and S. Wells. The effectiveness of alternative instructional methods: A survey. November 1972.

197 P. Suppes. A survey of cognition in handicapped children. December 29, 1972.

198 B. Searle, P. Lorton, Jr., A. Goldberg, P. Suppes, N. Ledet, and C. Jones. Computer-assisted instruction program: Tennessee State University. February 14, 1973.

199 D. Levine. Computer-based analytic grading for German grammar instruction. March 16, 1973.

200 P. Suppes, J. D. Fletcher, M. Zanotti, P. V. Lorton, Jr., and B. W. Searle. Evaluation of computer-assisted instruction in elementary mathematics for hearing-impaired students. March 17, 1973.

201 G. A. Huff. Geometry and formal linguistics. April 27, 1973.

202 C. Jensema. Useful techniques for applying latent trait mental-test theory. May 9, 1973.

203 A. Goldberg. Computer-assisted instruction: The application of theorem-proving to adaptive response analysis. May 25, 1973.

204 R. C. Atkinson, D. J. Herrmann, and K. T. Wescourt. Search processes in recognition memory. June 8, 1973.

205 J. Van Campen. A computer-based introduction to the morphology of Old Church Slavonic. June 18, 1973.

206 R. B. Kimball. Self-optimizing computer-assisted tutoring: Theory and practice. June 25, 1973.

207 R. C. Atkinson, J. D. Fletcher, E. Lindsay, J. O. Campbell, and A. Barr. Computer-assisted instruction in initial reading. July 9, 1973.

208 V. R. Charrow and J. D. Fletcher. English as the second language of deaf students. July 20, 1973.




209 J. A. Paulson. An evaluation of instructional strategies in a simple learning situation. July 30, 1973.



